Preface

As a student of physics I always felt that the difficulties I had with comprehending the
theories explained to me were mostly due to my own incapability. Pondering questions like "What is an
electric field?" somehow prevented me from actually solving Maxwell's equations, which is in
fact the thing that you have to do to pass your exam. But then working out the details of the
actual solving leads to various difficulties again. Indeed, quite often in physics one encounters
mathematical problems which one must pass over to obtain the desired answer. Sometimes
this results in peculiarities that seemed paradoxical to me, like a discontinuous solution to a
differential equation. As it turns out, I'm one of those persons who in many cases can't see
the bigger picture until he's worked out a lot of the details. Fortunately, working out details 
is an activity praised in mathematics (the unfortunate thing for me was that it took me over 
four years to discover this). 

When I first came up with the idea for this thesis I didn't know very much about the
foundations of quantum mechanics. Discussions on topics like "hidden variables", "contextuality" or
"locality" always seemed to be gladly avoided by the teachers during the lectures on quantum theory.
It was kind of a revelation (and a relief) for me to find out that the incomprehensibility of
quantum mechanics is not easily stepped over. This became clearer to me when I followed a
course on quantum probability taught by Dr. Maassen. One of the first topics discussed was
the violation of Bell inequalities by quantum mechanics, which demonstrates directly that the
probability measures obtained in quantum mechanics are fundamentally different from those
described by Kolmogorov's theory. This is interesting in particular for probability theorists,
but it wasn't the aim of Bell to advocate a revision of probability theory. Rather, it was his
aim to show that any hidden-variable theory that reproduces the predictions obtained from 
quantum mechanics must be non-local, i.e. it requires action at a distance. 

Roughly, a hidden-variable theory is a theory that allows a realist interpretation, i.e., a
theory in which observables can be interpreted to correspond to properties of systems that
actually exist. Personally, I never considered that this should be taken as a demand for
physical theories. Not because I have a strong opinion concerning the realist/idealist question
in philosophy, but more because I never considered it the task of physics to pass judgment
on such philosophical problems. However, there were enough questions raised in my head
to form a starting point for this thesis and luckily, none of them have been answered properly. 
These questions include the following. How can mathematics play a role in finding answers 
to metaphysical questions? How reliable is mathematics in this role? Why does quantum 
mechanics not allow (certain) realist interpretations? 

As it goes with such questions, trying out answers immediately leads to new questions. In
particular it becomes of interest what the role of mathematics is in physics and, even broader,
what the nature of mathematics is in itself, i.e. what is mathematics actually about?
Concerning this first problem I became particularly interested in probability theory, which, in
my opinion, is one of the purest forms of physics.^1 I remember a lecture during a course on
statistical physics taught by Prof. Vertogen during which he gave a derivation of the notion
of entropy from a Bayesian point of view, making use only of philosophical and logical
considerations (i.e., without resorting to the measure-theoretic approach). At that time I didn't

1 This view doesn't seem very popular, but it is in fact in accordance with the vision of Hilbert, who explained
his sixth problem as follows: "To treat in the same manner, by means of axioms, those physical sciences in
which mathematics plays an important part; in the first rank are the theory of probabilities and mechanics"
[Hil2].
recognize it as such, but it did make me realize how important logic is for the construction
of physical theories. Unfortunately for me, at the same time I also followed a course on logic
taught by Dr. Veldman. The unfortunate coincidence was that while Prof. Vertogen made
extensive use of the logical law ¬(¬X) = X (it was actually the first formula in the accompanying
reader), Dr. Veldman was advocating against the use of this law, which is abandoned
in intuitionistic logic. Needless to say, I had my logical conceptions all mixed up and I ended
up failing the exams for both courses.

Ever since I've had a sort of love-hate relationship with intuitionism. At first I didn't like
it at all and I tried to find a motivation for myself to see why the law of excluded middle should
be true. From a physics point of view, all the motivations I could find were based on realism,
which seemed to me to be too strong an assumption. On the other hand, it seemed to me that
if one can doubt one specific logical law, one might as well doubt all of logic. This is also what
Brouwer advocated and, roughly, he proposed that not logic but intuition should be our guide
to truth. It never became clear to me why this approach would be more reliable when it comes
to truth. But at least logic enables us to compare notes in an (almost) unambiguous way. I
came to accept logic as a tool for reasoning not because I believe it is true, but because I don't
see a better alternative. Then what about the law of excluded middle? I think from a realist
(or Platonist when it comes to mathematics) point of view it may be mandatory. Others may
want to learn to use it with care. For me, sometimes its truth seems almost evident^2 while on
other occasions it seems very suspicious (and the same holds for the axiom of choice for that 
matter). And as for truth, perhaps truth is overrated. 

I wouldn't have found a personally satisfactory view on physics and mathematics if it 
weren't for the aforementioned persons. In fact, I probably would have quit my studies within
the first three years without the toned-down views on physics Prof. Vertogen presented in
his lectures and I would like to thank him for that. I also would like to thank Prof. Landsman 
and Dr. Maassen for making me enthusiastic about mathematical physics and Dr. Veldman 
for teaching me about the philosophy of mathematics and intuitionism in particular. Without 
these people I would never have guessed it to be possible to write a philosophical thesis on 
physics with the use of mathematical rigor that still makes sense. For this I must also thank 
Dr. Seevinck who taught me a lot about the foundations of physics and who was often willing 
to listen to my own ideas. Finally I'd like to thank my girlfriend Femke for supporting me at
every step of the past ten years, for being my philosophical sparring partner from time to
time and above all for being my best friend.

Nijmegen, October 2009 



2 I think this also must have been the case for Brouwer for I see no better way to motivate his continuity 
principle. 
1 Introduction

Theoretical physicists live in a classical world, looking
into a quantum-mechanical world.

- J. S. Bell 

Quantum mechanics started as a counter-intuitive theory and has succeeded in preserving
this status ever since. Most introductions to quantum mechanics start with Planck's radiation
formula. This was the first formula that relied on the quantization of energy, which is a
departure from the "Natura non facit saltus" principle. Proposing the theoretical interpretation
of this radiation formula was described by Planck himself as an act of despair:

"Kurz zusammengefasst kann ich die ganze Tat als einen Akt der Verzweiflung
bezeichnen, denn von Natur bin ich friedlich und bedenklichen Abenteuern
abgeneigt aber eine Deutung musste um jeden Preis gefunden werden, und wäre
er noch so hoch. ... Im übrigen war ich zu jedem Opfer an meinen bisherigen
physikalischen Überzeugungen bereit." [Pla]

The act of despair in question was actually not the concept of allowing discontinuity in Nature 
(although this was related) but the reliance on Boltzmann's theory of statistical physics, which 
is based on atomism; i.e. the idea that all matter is made up of some smallest particles called 
atoms. Nowadays, the concept of atomism is part of the doctrine of natural science and nobody 
would question the existence of atoms. However, around 1900 there was no real consensus 
about the issue, and in fact Planck originally opposed it. 

In his later years, Planck came to accept the concept of atomism and thus conquered one of
the (to him) counter-intuitive aspects of his radiation formula, and thus of quantum mechanics.
However, this acceptance required a large revision of what is to be expected of Nature and of
what is to be expected of a physical theory. It seems that this has been characteristic of
the discussion on the foundations of quantum mechanics ever since. Some of the revisions
that have been proposed throughout the years will be discussed in this thesis. These include 
revisions of our view on: reality, causality, locality, free will, determinism and logic. Not 
the lightest of subjects, and it seems mind-boggling enough that quantum mechanics has led 
people to such considerations. 

An important motivation for the entire discussion is the search for an answer to the ques- 
tion: "What is actually being measured when a measurement is performed?" In classical 
physics there seemed to be an easy answer to this question; a measurement reveals some prop- 
erty possessed by the system under consideration. This is, roughly, the realist interpretation 
of physics. However, as it turns out, such an interpretation is not possible in quantum
mechanics without making compromises. The proof of this statement is usually attributed to the
Kochen-Specker Theorem and the violation of the Bell inequalities by quantum mechanics,
which will be discussed in Chapter 2. Both imply a compromise that has to be made if one
wishes to maintain realism. The Kochen-Specker Theorem implies that one has to resort to
contextuality^3 and the Bell inequality argument implies that one has to resort to non-locality.

3 Of all the philosophical concepts that play a role in this thesis, this is probably the most peculiar one. 
Roughly, it states that what is actually measured depends in a very strong sense on how it is measured i.e., it 
depends on the measuring context. Of course, each of these concepts will be explained more carefully in the 
course of this thesis. 



Most people do not wish to make such compromises, and therefore these proofs are also known 
as 'impossibility proofs' or 'no-go theorems'. 

There is a natural problem that arises in this situation. The statements derived are all 
of a philosophical nature, but on the other hand, rigorous proofs can only be made within 
mathematics. This is because when it comes to mathematical objects, most people agree on
how these objects may be manipulated to obtain new objects.^4 However, this also means that
philosophical and mathematical concepts have to be linked to one another, and there is of 
course no rigorous way to do this. In fact, there is not even much consensus about what terms 
like reality, locality and free will mean and what role they should play in a physical theory. 
This leaves a lot of room for discussion on what actually can be proven about Nature and 
about physical theories in particular. 

In Chapter 3 it will be shown that the Kochen-Specker Theorem is quite unstable under
variations in what realism is taken to entail. More specifically, it turns out that if one
relaxes the view on what constitutes a physical observable, one may retain non-contextuality.
The discussion becomes more philosophical in Chapter 4, where the Free Will Theorem of
Conway and Kochen is discussed. This is also the first point where it becomes clearer that
the strangeness of quantum mechanics does not just affect the realist interpretation of physics;
the indeterminacy introduced by quantum mechanics seems unavoidable in any succeeding
theory (consistent with current experimental knowledge), irrespective of whether one has
a realist or an anti-realist view. This leads to the question whether the earlier arguments used
to point out problems in the realist interpretation can be extended to also point out problems
that arise in other interpretations. In Chapter 5 it is argued that this does indeed seem to be
the case.

Hoping to acquire a better understanding of these problems, I undertake a short re-investigation
of the Copenhagen interpretation. It seems that the Copenhagen interpretation does
provide certain conceptual tools to overcome some philosophical problems concerning quantum
mechanics. However, most people, like myself, cannot help feeling some unease about this
interpretation. This feeling is similar to the one sometimes encountered when studying a
mathematical theorem; although the proof may convince one that the theorem should be 
true, it doesn't always provide the feeling that one understands what the theorem actually 
states. Often, a clarification is needed to explain why a theorem is stated the way it is, and 
what the idea behind the theorem is. 

Such a clarification appears to be missing for the Copenhagen interpretation. Bohr only 
recites some facts about the strangeness of quantum mechanics (at least, the facts as he sees 
them) and suggests how one should cope with them. The impossibility proofs show that these 
facts cannot easily be sidestepped and so indeed it seems that one must cope with them. 
However, structural philosophical arguments about what accounts for this strangeness are 
missing. There is no clear motivation for coping with the problems in the way suggested by 
Bohr. In Chapter 5 I will attempt to provide this motivation, by linking the philosophy of
Bohr to some of the philosophical ideas behind intuitionistic logic. More precisely, my hope is 
that an abuse of language in the sense meant by Bohr, may be avoided by adopting a different 
form of logic. In particular, it seems that the law of excluded middle provides a devious way 
to introduce sentences that speak of phenomena that cannot be compared with one another. 



4 It may be noted that there is no general consensus on what these objects are. However, in many cases 
this doesn't influence what is considered a proof and what is not. 




It is clear that Bohr wanted to avoid such sentences^5 and should have argued against this
law, hence embracing intuitionistic logic. However, he insisted on the use of classical logic. 



5 In the words of Wittgenstein: "Wovon man nicht reden kann, darüber muss man schweigen."
2 Introduction to the Foundations of Quantum Mechanics



2.1 Postulates of Quantum Mechanics 

The goal of this section is to give a brief introduction to (some of) the counter-intuitive
aspects of quantum theory and to show why they can't be resolved as easily as one might
hope (namely, due to the impossibility proofs for hidden-variable theories). First, (a version
of) the postulates of quantum mechanics, as originally introduced by von Neumann [vN], is
formulated. The version I use here is the one that was presented to me by Michael Seevinck
in a course on the foundations of quantum mechanics [See2]. Although probably most readers
already know these postulates in some form, I think it is good to restate them to give a more
complete overview. Moreover, it will make the discussion more precise, since there will now
be less ambiguity about what I mean when I refer to a particular postulate.^6 More than once
I found myself in a situation where I had an objection to some argument used in a text on
the foundations of quantum mechanics, only to find out that I was actually objecting to one of
the postulates in a different form. Also, it seems a good occasion to introduce the notation
used throughout this thesis.

1. State Postulate: Every physical system can be associated with a (complex) Hilbert
space^7 $\mathcal{H}$. Every nonzero vector $\psi \in \mathcal{H}$ gives a complete description of the state of the
system. For each $\lambda \in \mathbb{C}$, $\lambda \neq 0$, the two vectors $\psi$ and $\lambda\psi$ describe the same state. If two
systems are associated with spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, then the composite system is described
by the space $\mathcal{H}_1 \otimes \mathcal{H}_2$.

A more generalized notion of the state of a system is given by the language of density operators.
The states in the form of vectors in a Hilbert space are then called pure states. Notice that
each pure state in fact corresponds with an entire 'line' $(\lambda\psi)_{\lambda \in \mathbb{C}\setminus\{0\}}$ in $\mathcal{H}$, called a ray. To
each ray one associates the projection $P_\psi : \mathcal{H} \to \mathcal{H}$ on this line, given by

    $$P_\psi \phi := \frac{\langle\psi, \phi\rangle}{\langle\psi, \psi\rangle}\,\psi. \tag{1}$$

With a mixture of pure states one can then associate a convex combination of one-dimensional
projection operators.^8 More formally, a mixed state is a positive trace-class operator with trace
1. The set of mixed states will be denoted by $\mathcal{S}(\mathcal{H})$, and the set of pure states by $\mathcal{P}_1(\mathcal{H})$ (which

6 This also holds more generally; a lot of confusion might be avoided if more authors took the time to restate
important terms in their discussion.

7 In our definition of a Hilbert space, the inner product will be linear in the second term and anti-linear in
the first.

8 The interpretation of mixed states as an actual mixture of pure states is not entirely without problems.
For example, two mixtures of different pure states may constitute the same mixed state. Therefore, a mixed
state doesn't give complete information about the pure states of which it is a mixture. Moreover, interpreting


There is a theory which states that if ever anyone discov- 
ers exactly what the Universe is for and why it is here, 
it will instantly disappear and be replaced by something 
even more bizarre and inexplicable. There is another 
which states that this has already happened. 



- D. Adams 
stands for the set of one-dimensional projections). The set $\mathcal{S}(\mathcal{H})$ is a convex set. One may
show that the set $\mathcal{P}_1(\mathcal{H})$ corresponds to the set of all extreme points of $\mathcal{S}(\mathcal{H})$ (i.e., the
one-dimensional projections are precisely all the elements of $\mathcal{S}(\mathcal{H})$ that cannot be written as a
proper convex combination of other elements).
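
The statements about rays and mixtures can be checked in the simplest case $\mathcal{H} = \mathbb{C}^2$. The following is only a minimal sketch: the helper names (`inner`, `projection`, `mix`, `close`) and the chosen vectors are illustrative assumptions, not notation from this thesis. It verifies that $\psi$ and $\lambda\psi$ yield the same projection, and that two different mixtures of pure states can constitute the same mixed state.

```python
# Illustrative sketch for H = C^2; matrices are nested lists, all names are hypothetical.

def inner(psi, phi):
    """Inner product, anti-linear in the first argument and linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def projection(psi):
    """Matrix of the projection phi -> (<psi, phi> / <psi, psi>) psi."""
    norm2 = inner(psi, psi).real
    n = len(psi)
    return [[psi[i] * psi[j].conjugate() / norm2 for j in range(n)] for i in range(n)]

def mix(weights, states):
    """Convex combination sum_k w_k P_{psi_k} of one-dimensional projections."""
    n = len(states[0])
    rho = [[0j] * n for _ in range(n)]
    for w, psi in zip(weights, states):
        p = projection(psi)
        for i in range(n):
            for j in range(n):
                rho[i][j] += w * p[i][j]
    return rho

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

up, down = [1 + 0j, 0j], [0j, 1 + 0j]
plus, minus = [1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j]

# psi and lambda * psi lie on the same ray, so they give the same projection.
lam = 2 - 3j
same_ray = close(projection(plus), projection([lam * c for c in plus]))

# Two different mixtures of pure states give the same mixed state (half the identity).
rho_a = mix([0.5, 0.5], [up, down])
rho_b = mix([0.5, 0.5], [plus, minus])
same_mixture = close(rho_a, rho_b)
trace_a = sum(rho_a[i][i] for i in range(2)).real
```

Both `same_ray` and `same_mixture` come out true, and the trace of the mixed state is 1, as required of an element of $\mathcal{S}(\mathcal{H})$.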

2. Observable Postulate: With each physical observable $\mathcal{A}$, there is associated a
self-adjoint operator $A : D(A) \to \mathcal{H}$ with domain $D(A)$ dense in $\mathcal{H}$.

The theory of self-adjoint operators is noticeably more complex than most physics literature 
would lead one to believe, as is made clear in the following example. 



Example 2.1. Consider the Hilbert space $\mathcal{H} = L^2(\mathbb{R})$, the space of all square-integrable
functions. The position operator defined by $(X\psi)(x) := x\psi(x)$ does not map every $\psi \in \mathcal{H}$
to an element of $\mathcal{H}$. Therefore, the set $\mathcal{H}$ cannot be taken as its domain; instead,
one must take some dense subset $D(X)$. Its adjoint operator $X^*$ is defined as the unique
operator that satisfies

    $$\langle\psi, X\phi\rangle = \langle X^*\psi, \phi\rangle, \quad \forall \psi \in D(X^*),\ \phi \in D(X), \tag{2}$$

where the domain of $X^*$ is defined as

    $$D(X^*) := \{\psi \in \mathcal{H} \,;\, \phi \mapsto \langle\psi, X\phi\rangle \text{ is a bounded linear functional on } D(X)\}. \tag{3}$$

Intuitively, the larger one chooses $D(X)$, the smaller $D(X^*)$ becomes. It is therefore a
delicate matter to choose $D(X)$ in such a way that $X$ is self-adjoint, i.e., $(X, D(X)) =
(X^*, D(X^*))$. It turns out that $X$ is self-adjoint on the domain $D(X) = \{\psi \in \mathcal{H} \,;\, X\psi \in \mathcal{H}\}$.

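Note that for the specification of this domain it is necessary that $X\psi$ is actually
defined for all $\psi$. For $X$ this causes no problems, but for the momentum operator $P$,
the expression $(P\psi)(x) = -i\hbar\frac{d}{dx}\psi(x)$ is not well-defined for most elements of $\mathcal{H}$ without
introducing the notion of a generalized function (also called a distribution). First one
introduces an injection $\psi \mapsto L_\psi$ of $\mathcal{H}$ into the set of all linear functionals on the vector
space $C_c^\infty(\mathbb{R})$ (i.e. the set of all infinitely differentiable functions with compact support):

    $$L_\psi(\phi) := \langle\psi, \phi\rangle, \quad \forall \phi \in C_c^\infty(\mathbb{R}). \tag{4}$$

On this space of linear functionals, one defines the derivative as

    $$\frac{d}{dx}L_\psi(\phi) := -\left\langle\psi, \frac{d}{dx}\phi\right\rangle, \quad \forall \phi \in C_c^\infty(\mathbb{R}). \tag{5}$$

Now for any $\psi \in \mathcal{H}$ one takes the condition $\frac{d}{dx}\psi \in \mathcal{H}$ to mean that there exists a $\chi \in \mathcal{H}$
such that $\frac{d}{dx}L_\psi = L_\chi$. In this case, one defines $\frac{d}{dx}\psi = \chi$. In particular, if $\psi$ is differentiable
(in the usual sense) with derivative in $\mathcal{H}$ one has

    $$\frac{d}{dx}L_\psi(\phi) := -\left\langle\psi, \frac{d}{dx}\phi\right\rangle = \left\langle\frac{d}{dx}\psi, \phi\right\rangle = L_{\frac{d}{dx}\psi}(\phi), \quad \forall \phi \in C_c^\infty(\mathbb{R}), \tag{6}$$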



the mixed states as 'not actually knowing what the pure state is' leads to problems when considering composite
states, since a mixed state is in general a convex combination of pure states of the form $P_{\psi_1 \otimes \psi_2}$. Such states
are known as proper mixtures, other mixtures are called improper. This terminology is due to d'Espagnat.
See also [dH].
so that this new notion of a derivative is a proper extension of the usual one. It then turns
out that the momentum operator $P$ is self-adjoint on the domain

    $$D(P) := \{\psi \in AC(\mathbb{R}) \,;\, P\psi \in \mathcal{H}\}, \tag{7}$$

where $AC(\mathbb{R})$ is the set of all functions that are absolutely continuous^9 on each finite
interval of $\mathbb{R}$. A proof can be found in [Yos, p. 198]. In this book one can also find a proof
of the peculiar fact that there is no self-adjoint momentum operator on the Hilbert space
$L^2[0,\infty)$ (p. 353). A friendly text on these problems that also emphasizes the relevance
for physics and chemistry is [BFV].
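
To see concretely why the position operator of Example 2.1 cannot be defined on all of $L^2(\mathbb{R})$, it may help to look at one explicit function. The particular choice of $\psi$ below is an illustrative assumption and is not taken from the references cited above.

```latex
\psi(x) = \frac{1}{\sqrt{1+x^2}}, \qquad
\int_{\mathbb{R}} |\psi(x)|^2 \, dx = \int_{\mathbb{R}} \frac{dx}{1+x^2} = \pi < \infty, \qquad
\int_{\mathbb{R}} |x\psi(x)|^2 \, dx = \int_{\mathbb{R}} \frac{x^2}{1+x^2} \, dx = \infty .
```

Hence this $\psi$ is an element of $L^2(\mathbb{R})$, while the function $x \mapsto x\psi(x)$ is not, so $\psi$ lies outside the domain of the position operator.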



These are examples of self-adjoint operators that play a large role in the theory of quantum
mechanics. However, most self-adjoint operators don't play any role in quantum theory. One
may, for example, consider the operator $X + P$ on the Hilbert space $L^2[0,1]$. In this case, the
operator $X$ is self-adjoint on the domain $D(X) = \mathcal{H}$. The momentum operator is self-adjoint
on the domain

    $$D(P) = \left\{\psi \in \mathcal{H} \,;\, \psi \text{ is absolutely continuous},\ \frac{d}{dx}\psi \in \mathcal{H},\ \psi(0) = \psi(1)\right\}. \tag{8}$$

It follows from the Kato-Rellich theorem (see for example [dO, Ch. 6]) that $X + P$ is
self-adjoint on the domain $D(X + P) = D(P)$.

Though self-adjoint, the operator X + P has no direct physical meaning. But even an 
indirect meaning (e.g. by adding the measuring results of position and momentum) is am- 
biguous, since one cannot measure both observables at the same time (because X and P do 
not commute). This leads one to question the converse of the observable postulate, i.e. the 
claim that every self-adjoint operator corresponds with an observable. It seems reasonable 
to deny this claim. On the other hand, it seems premature to exclude some operators (like 
X + P) a priori, since it cannot be excluded that a meaning for such an observable will be 
found in the future. 

For a bounded operator $A$ on a Hilbert space $\mathcal{H}$ its spectrum is defined as the set

    $$\sigma(A) := \{a \in \mathbb{C} \,;\, A - a\mathbb{1} \text{ is not invertible}\}. \tag{9}$$

For an unbounded operator $A$ with domain $D(A)$ dense in $\mathcal{H}$ the spectrum $\sigma(A)$ can still be
defined. In this case $a \notin \sigma(A)$ if and only if there exists a bounded operator $B$ such that
$(A - a\mathbb{1})B = \mathbb{1}$ and $B(A - a\mathbb{1})\psi = \psi$ for all $\psi \in D(A)$. The spectrum is a generalization
of the notion of the set of eigenvalues of a matrix. As with Hermitian matrices, the spectrum of a
self-adjoint operator is always a subset of the real numbers. This physically justifies the following
postulate.^10

9 A function $\psi$ is called absolutely continuous on the interval $I$ if for each $\epsilon > 0$ there exists a $\delta > 0$ such
that for each finite sequence $(a_n, b_n)$ of pairwise disjoint open sub-intervals of $I$, one has $\sum_n |\psi(b_n) - \psi(a_n)| < \epsilon$
whenever $\sum_n |b_n - a_n| < \delta$.

10 This postulate is often seen as a part of the Born postulate. Indeed, the Born postulate implies that
a measurement result almost surely (i.e., with probability one) is a value in the spectrum of $A$. The value
postulate sharpens this by stating that measurement results outside $\sigma(A)$ can actually never be obtained.




3. Value Postulate: A measurement of a physical observable $\mathcal{A}$ yields a real number in
the spectrum of the associated self-adjoint operator $A$.
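
In finite dimensions the spectrum is just the set of eigenvalues, so the value postulate can be illustrated with a $2 \times 2$ matrix. The sketch below is an illustration under that finite-dimensional assumption; the function name `eigenvalues_2x2` and both matrices are hypothetical choices. It contrasts a Hermitian matrix, whose eigenvalues are real and hence admissible measurement results, with a matrix that is not self-adjoint.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial z^2 - tr(m) z + det(m) of a 2x2 matrix."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr - disc) / 2, (tr + disc) / 2)

# A self-adjoint (Hermitian) matrix: it equals its own conjugate transpose.
hermitian = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
# A matrix that is not self-adjoint, for contrast.
non_hermitian = [[0j, 1 + 0j], [-1 + 0j, 0j]]

spec_h = eigenvalues_2x2(hermitian)      # real eigenvalues
spec_n = eigenvalues_2x2(non_hermitian)  # purely imaginary eigenvalues
```

For the Hermitian matrix the two eigenvalues are 1 and 4, so an observable associated with it could only ever yield one of these two numbers. The second matrix has the eigenvalues $\pm i$ and could not be associated with an observable.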

The value postulate does not make any statement about the actual result of a specific
measurement. To elaborate on this postulate, one of the wonderful results of functional
analysis is needed: the spectral theorem.^11



Theorem 2.1. For each densely-defined self-adjoint operator $A$, there is a spectral measure^12
$\mu_A$ such that

(i) $A = \int_{\mathbb{R}} z \, d\mu_A(z)$ (as a Stieltjes integral).

(ii) If $\Delta \cap \sigma(A) = \emptyset$, then $\mu_A(\Delta) = 0$ for each Borel set $\Delta$.

(iii) For each open subset $U \subset \mathbb{R}$ with $U \cap \sigma(A) \neq \emptyset$, one has $\mu_A(U) \neq 0$.

(iv) If $B$ is a bounded operator such that $BA \subset AB$,^13 then $B$ also commutes with $\mu_A(\Delta)$
for every Borel set $\Delta$.

This mathematically justifies the following postulate. 

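4. Born Postulate: The probability of finding a result $a \in \Delta$ upon measurement of the
observable $\mathcal{A}$ on a system in the state $\psi$ for a Borel set $\Delta$ is given by

    $$\mathbb{P}_\psi[\mathcal{A} \in \Delta] = \frac{\langle\psi, \mu_A(\Delta)\psi\rangle}{\langle\psi, \psi\rangle}. \tag{10}$$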



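The notation $\mathbb{P}_\psi[\mathcal{A} \in \Delta]$ is probably more familiar to probability theorists than to physicists,
but I think it is a convenient one. It is to be read as the probability of the event $\mathcal{A} \in \Delta$,
given the state $\psi$. Similarly, I write

    $$\mathbb{E}_\psi(\mathcal{A}) := \int_{\mathbb{R}} z \, \mathbb{P}_\psi[\mathcal{A} \in dz] = \frac{1}{\langle\psi, \psi\rangle}\int_{\mathbb{R}} z \, \langle\psi, \mu_A(dz)\psi\rangle = \frac{\langle\psi, A\psi\rangle}{\langle\psi, \psi\rangle} \tag{11}$$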



for the expectation value, instead of the often seen notation $\langle A \rangle_\psi$. Note that it is more common
to take the right-hand side of (11) as the definition of the quantum-mechanical expectation
value. Its relation with the probabilities for measurement results (the left-hand side of (11))
may then be seen to be a consequence of the spectral theorem (i.e., according to this theorem
(11) is equivalent to (10)).
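
For an operator on $\mathbb{C}^2$ the spectral measure reduces to the projections on the eigenspaces, which makes the relation between (10) and (11) easy to check by hand or by machine. The following is a minimal sketch under that assumption. The operator (the matrix with rows $(0,1)$ and $(1,0)$), the deliberately unnormalized state and all helper names are illustrative choices, not part of the postulates.

```python
def inner(psi, phi):
    """Inner product, anti-linear in the first argument and linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def apply(m, psi):
    """Apply the matrix m to the vector psi."""
    return [sum(m[i][j] * psi[j] for j in range(len(psi))) for i in range(len(m))]

def projection(chi):
    """Projection on the ray spanned by chi."""
    n2 = inner(chi, chi).real
    return [[chi[i] * chi[j].conjugate() / n2 for j in range(len(chi))]
            for i in range(len(chi))]

def born(psi, proj):
    """Born probability <psi, proj psi> / <psi, psi> for a (possibly unnormalized) state."""
    return (inner(psi, apply(proj, psi)) / inner(psi, psi)).real

# Operator with eigenvalues +1 and -1 and eigenvectors (1, 1) and (1, -1).
A = [[0j, 1 + 0j], [1 + 0j, 0j]]
spectral = {+1.0: projection([1 + 0j, 1 + 0j]), -1.0: projection([1 + 0j, -1 + 0j])}

psi = [3 + 0j, 1 + 0j]  # nonzero but not normalized

probs = {a: born(psi, p) for a, p in spectral.items()}
expect_from_probs = sum(a * p for a, p in probs.items())         # sum over z of z * P[A = z]
expect_direct = (inner(psi, apply(A, psi)) / inner(psi, psi)).real  # <psi, A psi> / <psi, psi>
```

Here the outcome $+1$ has probability 0.8 and the outcome $-1$ has probability 0.2, and both ways of computing the expectation value give 0.6.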



In the generalized case where mixed states are considered, the Born rule generalizes to

    $$\mathbb{P}_\rho[\mathcal{A} \in \Delta] = \mathrm{Tr}(\rho\,\mu_A(\Delta)) \tag{12}$$

for the mixed state $\rho$, where $\mathrm{Tr}$ denotes the trace operation. One easily checks that this results
in (10) in case that $\rho = P_\psi$ (i.e., whenever $\rho$ is a pure state).
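
The check that (12) reduces to (10) for a pure state can be carried out explicitly in $\mathbb{C}^2$. The sketch below is an illustration only; the state, the projection `mu_plus` (standing in for one value $\mu_A(\Delta)$ of a spectral measure) and the helper names are assumptions made for the example.

```python
def inner(psi, phi):
    """Inner product, anti-linear in the first argument and linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def projection(chi):
    """Projection on the ray spanned by chi, as a 2x2 matrix."""
    n2 = inner(chi, chi).real
    return [[chi[i] * chi[j].conjugate() / n2 for j in range(2)] for i in range(2)]

def apply(m, psi):
    """Apply the 2x2 matrix m to the vector psi."""
    return [sum(m[i][j] * psi[j] for j in range(2)) for i in range(2)]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

mu_plus = projection([1 + 0j, 1 + 0j])  # plays the role of mu_A(Delta)
psi = [3 + 0j, 1 + 0j]

prob_pure = (inner(psi, apply(mu_plus, psi)) / inner(psi, psi)).real  # pure-state rule
prob_trace = trace_of_product(projection(psi), mu_plus).real           # trace rule, rho = P_psi

# A genuinely mixed state: half the identity.
rho_mixed = [[0.5 + 0j, 0j], [0j, 0.5 + 0j]]
prob_mixed = trace_of_product(rho_mixed, mu_plus).real
```

Both rules give 0.8 for the pure state, while the maximally mixed state assigns probability 0.5 to the same question.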



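11 See for example [Con2] or [Rud] for proofs.

12 A spectral measure is a map $\mu$ from the Borel subsets $\mathcal{B}$ of $\mathbb{R}$ to the projection operators $\mathcal{P}(\mathcal{H})$ such that
$\mu(\emptyset) = 0$, $\mu(\mathbb{R}) = \mathbb{1}$, $\mu(\Delta_1 \cap \Delta_2) = \mu(\Delta_1)\mu(\Delta_2)$ $\forall \Delta_1, \Delta_2 \in \mathcal{B}$ and for each countable set of disjoint subsets
$(\Delta_i)_{i=1}^\infty$ in $\mathcal{B}$ one has $\mu(\bigcup_{i=1}^\infty \Delta_i) = \sum_{i=1}^\infty \mu(\Delta_i)$.

13 This means that $D(BA) \subset D(AB)$ and $AB\psi = BA\psi$ for all $\psi \in D(BA)$.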
Remark 2.1. Observables associated with projection operators (in particular, those of the
form $\mu_A(\Delta)$) are usually regarded as the "yes-no" questions. Since their spectrum is $\{0, 1\}$,
a measurement of such an observable always yields one of these numbers. For an observable
$\mathcal{P}$ associated with the operator $P$, the number 1 corresponds to the answer "The state of
the system after the measurement lies in the space $P\mathcal{H}$." and the number 0 corresponds to
the answer "The state of the system after the measurement lies in the space $(\mathbb{1} - P)\mathcal{H}$." In
particular, a measurement of the observable associated with an operator of the form $\mu_A(\Delta)$
can be associated with the question "Does the value of $\mathcal{A}$ lie in the set $\Delta$?" The precise
meaning of these questions (and their answers) will play an underlying role in this thesis.

Many physicists may find this use of mathematics overwhelming and maybe even unneces- 
sary. In most physics literature one simply refers to the spectral decomposition of an operator 
without explicitly defining what this means. Operators are often treated as if they are matrices 
and their spectra are then referred to as the set of eigenvalues with corresponding eigenstates 
and eigenspaces. As a student of physics I became confused when first realizing that, for 
example, the position operator X does not have any eigenstates. Moreover, it didn't become 
clear to me why the probability of finding a particle in some (Borel) subset A was given by 

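    $$\mathbb{P}_\psi[\mathcal{X} \in \Delta] = \frac{1}{\langle\psi, \psi\rangle}\int_\Delta |\psi(x)|^2 \, dx, \tag{13}$$

until I learned (in a mathematics course) that the spectral measure for the position operator
is simply given by^14 $\mu_X(\Delta)\psi = 1_\Delta\psi$ (because $\sigma(X) = \mathbb{R}$), so that (13) is a special case of
(10).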
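
As a numerical illustration of (13), consider the unnormalized Gaussian $\psi(x) = e^{-x^2/2}$ and the Borel set $\Delta = [-1, 1]$. The sketch below rests on several assumptions made only for the example: the choice of $\psi$, a simple midpoint rule for the integrals, and the truncation of the integral over $\mathbb{R}$ to $[-10, 10]$. The function names are hypothetical.

```python
import math

def psi(x):
    """Unnormalized Gaussian wave function (an illustrative choice)."""
    return math.exp(-x * x / 2)

def density(x):
    """Integrand |psi(x)|^2 of the position probability."""
    return abs(psi(x)) ** 2

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# <psi, psi>: the tails beyond |x| = 10 are negligible for this psi.
norm2 = integrate(density, -10.0, 10.0)

# Probability of finding the particle in Delta = [-1, 1].
prob = integrate(density, -1.0, 1.0) / norm2
```

For this particular $\psi$ the result can be compared with the closed form: the norm squared is $\sqrt{\pi}$ and the probability equals the error function evaluated at 1, which is what the numerical values reproduce.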



Finally, it has to be specified how the state of the system changes in time. Actually, two 
postulates are needed for this. 

5. Schrödinger Postulate: When no measurement is performed on the system, the
change of the state in time is described by a unitary transformation. That is,

    $$\psi(t) = U(t)\psi(0)$$

for some strongly continuous unitary one-parameter group^15 $t \mapsto U(t)$.



Note that no distinction in notation is made between the map $\psi : \mathbb{R} \to \mathcal{H}$, $t \mapsto \psi(t)$ and the
vector $\psi$ in $\mathcal{H}$, as is standard in most literature. This postulate is in fact equivalent to the one
found in more standard physics literature. Namely, because of Stone's Theorem there exists
a self-adjoint operator $H$ such that $U(t) = e^{-iHt}$ $\forall t$, which brings one back to the original
Schrödinger equation

    $$i\frac{d}{dt}\psi(t) = H\psi(t).$$

For a mixed state $\rho$ the time evolution is given by $\rho(t) = U(t)\rho U^*(t)$, or $i\frac{d}{dt}\rho(t) = [H, \rho(t)]$.
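
The link between the unitary group and its generator can be made tangible in $\mathbb{C}^2$. In the sketch below the generator $H$ is taken to be the matrix with rows $(0,1)$ and $(1,0)$, an illustrative assumption for which $e^{-iHt}$ has the closed form $\cos(t)\,\mathbb{1} - i\sin(t)\,H$ because $H^2 = \mathbb{1}$. The helper names and the initial state are hypothetical as well. The code checks the group property, the preservation of the norm, and the relation $i\frac{d}{dt}\psi(t) = H\psi(t)$ at $t = 0$ by a forward difference.

```python
import math

def U(t):
    """U(t) = exp(-i H t) for the chosen H; closed form valid because H squared is the identity."""
    c, s = math.cos(t), math.sin(t)
    return [[complex(c, 0.0), complex(0.0, -s)],
            [complex(0.0, -s), complex(c, 0.0)]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def apply(m, psi):
    """Apply the 2x2 matrix m to the vector psi."""
    return [sum(m[i][j] * psi[j] for j in range(2)) for i in range(2)]

def norm2(psi):
    """Squared norm of a vector."""
    return sum(abs(c) ** 2 for c in psi)

H = [[0j, 1 + 0j], [1 + 0j, 0j]]
psi0 = [1 + 0j, 2j]
t, s = 0.3, 0.5

# Group property U(t + s) = U(t) U(s).
group_error = max(abs(x - y)
                  for row_a, row_b in zip(U(t + s), matmul(U(t), U(s)))
                  for x, y in zip(row_a, row_b))

# Unitarity: the norm of the state does not change in time.
norm_drift = abs(norm2(apply(U(t), psi0)) - norm2(psi0))

# i * d/dt psi(t) at t = 0, approximated by a forward difference, compared with H psi(0).
h = 1e-6
lhs = [1j * (a - b) / h for a, b in zip(apply(U(h), psi0), psi0)]
rhs = apply(H, psi0)
generator_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

All three error quantities are zero up to rounding and the discretization error of the forward difference.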

6. Von Neumann Postulate (Projection Postulate): When an observable $\mathcal{A}$
corresponding to an operator $A$ with discrete spectrum is measured and the
measurement yields some $a \in \sigma(A)$, the state of the system changes discontinuously from $\psi$ to
$\mu_A(\{a\})\psi$.



14 Here, $1_\Delta$ denotes the indicator function for the set $\Delta$.

15 This means that the set $\{U(t) \,;\, t \in \mathbb{R}\}$ forms a group of unitary operators satisfying $U(t + s) = U(t)U(s)$
$\forall s, t$, where the map $t \mapsto U(t)$ is continuous in the sense that $\lim_{s \to t} U(s)\psi = U(t)\psi$ for all $t \in \mathbb{R}$, $\psi \in \mathcal{H}$.
Note that the state of the system after measurement is indeed always a state (i.e. $\mu_A(\{a\})\psi \neq 0$)
because of the Born postulate.

The motivation for introducing this postulate is that it ensures that if a measurement of
an observable yields some value $a$, an immediate second measurement of the same observable
will yield exactly the same result. It is founded on experimental experience (von Neumann
based it on the Compton-Simons experiment [vN]) and therefore seems a necessary claim.
However, the postulate as stated here applies only to discrete observables (i.e., those whose
corresponding operators have a discrete spectrum). It has, in fact, been shown that a similar
postulate for continuous spectra cannot be formulated: repeatable measurements are only
possible for discrete observables [Oza]. A more extensive discussion can be found in [BLM].
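
The repeatability that motivates the projection postulate can be verified directly for a discrete observable on $\mathbb{C}^2$. The following sketch is illustrative only: the operator with eigenvalues $\pm 1$ and eigenvectors $(1,1)$ and $(1,-1)$, the initial state and the helper names are assumptions chosen for the example.

```python
def inner(psi, phi):
    """Inner product, anti-linear in the first argument and linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def apply(m, psi):
    """Apply the 2x2 matrix m to the vector psi."""
    return [sum(m[i][j] * psi[j] for j in range(2)) for i in range(2)]

def projection(chi):
    """Projection on the ray spanned by chi, as a 2x2 matrix."""
    n2 = inner(chi, chi).real
    return [[chi[i] * chi[j].conjugate() / n2 for j in range(2)] for i in range(2)]

def born(psi, proj):
    """Born probability <psi, proj psi> / <psi, psi>."""
    return (inner(psi, apply(proj, psi)) / inner(psi, psi)).real

# Spectral projections of an observable with the two outcomes +1 and -1.
mu = {+1: projection([1 + 0j, 1 + 0j]), -1: projection([1 + 0j, -1 + 0j])}

psi = [3 + 0j, 1 + 0j]
first_probs = {a: born(psi, p) for a, p in mu.items()}

# Suppose the first measurement yields +1: the state changes from psi to mu({+1}) psi.
psi_after = apply(mu[+1], psi)

# Probabilities for an immediate second measurement of the same observable.
repeat_probs = {a: born(psi_after, p) for a, p in mu.items()}
```

Before the measurement both outcomes are possible (with probabilities 0.8 and 0.2), whereas after the state change the outcome $+1$ recurs with probability 1.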

The von Neumann postulate is probably the most controversial postulate of quantum
mechanics. Because the time evolution of the state of a system depends so greatly on whether
or not there is a measurement being performed on the system, one is tempted to ask what
exactly constitutes a measurement. No satisfactory answer to this question exists in my
opinion, and it is one of the underlying questions of what is known as "the measurement problem"
(see also [Bel5]). Compared to the difficulty of this problem, the problem of repeatability for
observables with a continuous spectrum seems a rather small one. It seems likely to me that
a philosophically satisfying solution of the measurement problem may also solve the latter.^16

Although it certainly is an interesting topic for research, the measurement problem will
not be the focus of this thesis. My problem is rather related to one of the earliest objections
against quantum mechanics, made clear for the first time by Einstein, namely its possible
incompleteness.



2.2 The (In)completeness of Quantum Mechanics (Part I)

It follows from the Born postulate that, in general, the state of the system does not determine
the outcome of a measurement. This in itself was not the only problem
Einstein had with quantum theory. An even more serious problem for him was that certain
observables like energy and momentum, which are even supposed to be conserved, are not
attributed a particular value at all in quantum mechanics. That is, if one knows the state of
the system, one cannot in general say what the momentum of a particle in the system is. In
[EPR] Einstein, Podolsky and Rosen also gave a seemingly convincing reason why a complete
theory should attribute a value to such observables at all times. Below, an experiment is
discussed that is quite similar to the one discussed in [EPR], but has the advantage that there
are no unbounded operators involved. It was first introduced by Bohm [BA] and has played
a central role in many discussions ever since.



¹⁶ There have been proposals for introducing generalizations of the von Neumann postulate that are rich
enough to incorporate observables with a continuous spectrum. However, these have far-reaching consequences.
In the scenario described in [Sri] it requires a generalization of the state postulate to a point where the original
states (the density operators) are only associated with the probability functions they generate through the
Born postulate. The generalized states also allow probability functions that are no longer of the form (12) and
are no longer $\sigma$-additive in general.

Example 2.2 (The EPRB-experiment). Consider a system of two spin-$\tfrac12$ particles (say,
two electrons). Each particle taken by itself can be described in a Hilbert space $\mathbb{C}^2$, where
a basis is chosen such that $(1,0)$ stands for spin up and $(0,1)$ for spin down. Physicists
would use $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$ to denote those basis vectors. Traditionally the spin is considered
along the z-axis and the corresponding observable for the spin along this axis is

$$\sigma_z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{13}$$

For the x and y axes one has

$$\sigma_x := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{14}$$

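Consequently, for a measurement of the spin along the axis

$$r = (\cos\vartheta\sin\varphi,\ \sin\vartheta\sin\varphi,\ \cos\varphi) \tag{15}$$

one has the observable

$$\sigma_r := \cos\vartheta\sin\varphi\,\sigma_x + \sin\vartheta\sin\varphi\,\sigma_y + \cos\varphi\,\sigma_z
= \begin{pmatrix} \cos\varphi & \cos\vartheta\sin\varphi - i\sin\vartheta\sin\varphi \\ \cos\vartheta\sin\varphi + i\sin\vartheta\sin\varphi & -\cos\varphi \end{pmatrix}
= P_{r+} - P_{r-}, \tag{16}$$

where $P_{r+} = \tfrac12(\mathbb{1} + \sigma_r)$ and $P_{r-} = \tfrac12(\mathbb{1} - \sigma_r)$ are one-dimensional projections (this is easily
checked using $\sigma_r^2 = \mathbb{1}$). Thus, the projection $P_{r+}$ corresponds to the question whether one
will find spin up along the r-axis. Let's denote the corresponding observable by $\mathcal{P}_r$ (see
Remark 2.1). For a spin-$\tfrac12$ particle in a state $\psi$ it then follows that

$$\mathbb{P}_\psi[\mathcal{P}_r = 1] = \langle\psi, P_{r+}\psi\rangle, \qquad \mathbb{P}_\psi[\mathcal{P}_r = 0] = \langle\psi, (\mathbb{1} - P_{r+})\psi\rangle = \langle\psi, P_{r-}\psi\rangle. \tag{17}$$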
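The construction of the spin observable along an arbitrary axis can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard Pauli matrices and the angle convention of (15); the function names (`sigma_r`, `projections`, `expectation`) and the chosen angles are illustrative and not part of the original text.

```python
import cmath
import math


def sigma_r(theta, phi):
    """Spin observable along r = (cos(theta) sin(phi), sin(theta) sin(phi), cos(phi))."""
    # Off-diagonal entry: cos(theta) sin(phi) - i sin(theta) sin(phi)
    off = math.sin(phi) * cmath.exp(-1j * theta)
    return [[math.cos(phi), off],
            [off.conjugate(), -math.cos(phi)]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def projections(s):
    """Return P_{r+} = (1 + sigma_r)/2 and P_{r-} = (1 - sigma_r)/2."""
    ident = [[1, 0], [0, 1]]
    plus = [[(ident[i][j] + s[i][j]) / 2 for j in range(2)] for i in range(2)]
    minus = [[(ident[i][j] - s[i][j]) / 2 for j in range(2)] for i in range(2)]
    return plus, minus


def expectation(psi, op):
    """Inner product <psi, op psi>, antilinear in the first argument."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(complex(psi[i]).conjugate() * op_psi[i] for i in range(2))


theta, phi = 0.7, 1.1          # arbitrary angles, chosen only for illustration
s = sigma_r(theta, phi)
p_plus, p_minus = projections(s)

psi = [1, 0]                   # spin up along the z-axis
prob_up = expectation(psi, p_plus).real     # probability of spin up along r
prob_down = expectation(psi, p_minus).real  # probability of spin down along r
```

For a particle prepared with spin up along z, the probability of finding spin up along r comes out as $(1 + \cos\varphi)/2$, and the two probabilities add up to one, as (17) requires.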

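The combined system is then described by the Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2 \simeq \mathbb{C}^4$, where the
following connection is made between the two descriptions of this space:

$$\begin{aligned}
(1,0)\otimes(1,0) &= (1,0,0,0)\ (= |{\uparrow\uparrow}\rangle), & (1,0)\otimes(0,1) &= (0,1,0,0)\ (= |{\uparrow\downarrow}\rangle), \\
(0,1)\otimes(1,0) &= (0,0,1,0)\ (= |{\downarrow\uparrow}\rangle), & (0,1)\otimes(0,1) &= (0,0,0,1)\ (= |{\downarrow\downarrow}\rangle).
\end{aligned} \tag{18}$$

An observable $\mathcal{A}$ corresponding to the operator $A$ for the one-particle system extends to
an observable for the first particle (the one 'on the left') in the combined system with the
operator $A \otimes \mathbb{1}$. For the second particle $\mathcal{A}$ extends to the operator $\mathbb{1} \otimes A$. Such observables
always commute, since

$$(A \otimes \mathbb{1})(\mathbb{1} \otimes B) = A \otimes B = (\mathbb{1} \otimes B)(A \otimes \mathbb{1}). \tag{19}$$

Therefore, one can simultaneously perform measurements on particle one and on particle
two. Also, note that for two projections $P_1$ and $P_2$, the operator $P_1 \otimes P_2$ is again a
projection. Now suppose the system is prepared in the state

$$\psi = \tfrac{1}{\sqrt{2}}(0, 1, -1, 0) \quad \left(= \tfrac{1}{\sqrt{2}}\left(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\right)\right). \tag{20}$$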
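If the spin along the z-axis for the first particle is measured one finds that the probabilities
for finding spin up or spin down are respectively

$$\langle\psi, (P_{z+} \otimes \mathbb{1})\psi\rangle = \tfrac12 \qquad \text{and} \qquad \langle\psi, (P_{z-} \otimes \mathbb{1})\psi\rangle = \tfrac12. \tag{21}$$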

The reasoning of Einstein, Podolsky and Rosen now is as follows. According to the von 
Neumann postulate, after the measurement the state of the system will be (0, 1, 0, 0) upon 
finding spin up and (0, 0, 1, 0) upon finding spin down. In either case, a measurement of 
the spin along the z-axis on the second particle will yield a particular result (the opposite 
of the spin of the first particle) with absolute certainty. Since one can make a prediction 
of the spin of the second particle without in any way disturbing this particle (the distance 
between the two particles may be arbitrarily large) one may state that the spin along the 
z-axis has a real meaning. That is, the spin along the z-axis appears to be an observable 
that is actually meaningful for the observed system. Einstein, Podolsky and Rosen would 
say that there exists an element of physical reality that corresponds to this observable. 

Now, if one were to measure the spin along the y-axis on particle one instead, the state
of the system would be projected to the state $\tfrac12(i, 1, -1, i)$ if one found spin up, and
to the state $\tfrac12(-i, 1, -1, -i)$ if one found spin down. Each happens with probability $\tfrac12$.
In either case, a measurement of the spin along the y-axis on the second particle yields
a particular result (the opposite of the spin of the first particle) with absolute certainty.
Following the same reasoning, one concludes that also the spin along the y-axis of the
second particle should correspond to an element of physical reality.

Now the problem is the following: in the states $(0, 1, 0, 0)$ and $(0, 0, 1, 0)$ one can assign
a value to the spin along the z-axis for the second particle, but the spin along the y-axis
does not have a definite value. Vice versa for the states $\tfrac12(i, 1, -1, i)$ and $\tfrac12(-i, 1, -1, -i)$. It
turns out that there is no state that can assign a definite value to both observables at
the same time and hence there is no state in quantum mechanics that can give a complete
description of the system, because there is always at least one observable that has no
definite value in that state.
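The probabilities and post-measurement states in this example can be reproduced with a few lines of linear algebra. The sketch below is plain Python and assumes the identification of $\mathbb{C}^2 \otimes \mathbb{C}^2$ with $\mathbb{C}^4$ given in the example, the standard Pauli matrices, and the von Neumann postulate in the form "project and normalize"; all helper names are illustrative.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def apply(op, v):
    """Matrix-vector product."""
    return [sum(op[i][j] * v[j] for j in range(len(v))) for i in range(len(op))]


def inner(u, v):
    """Inner product <u, v>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def normalize(v):
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v]


ONE = [[1, 0], [0, 1]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def projector(sigma, sign):
    """Spectral projection (1 + sign * sigma) / 2 for the eigenvalue `sign`."""
    return [[(ONE[i][j] + sign * sigma[i][j]) / 2 for j in range(2)]
            for i in range(2)]


# Singlet state (0, 1, -1, 0) / sqrt(2)
psi = [0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0]


def measure_first(sigma, sign):
    """Probability of outcome `sign` on particle one, and the projected state."""
    projected = apply(kron(projector(sigma, sign), ONE), psi)
    prob = inner(projected, projected).real
    return prob, normalize(projected)


def prob_second(state, sigma, sign):
    """Probability of outcome `sign` for the same axis on particle two."""
    return inner(state, apply(kron(ONE, projector(sigma, sign)), state)).real


prob_z_up, state_z_up = measure_first(SIGMA_Z, +1)
prob_y_up, state_y_up = measure_first(SIGMA_Y, +1)
opposite_z = prob_second(state_z_up, SIGMA_Z, -1)   # spin down on particle two
opposite_y = prob_second(state_y_up, SIGMA_Y, -1)
```

Both `prob_z_up` and `prob_y_up` evaluate to one half, while `opposite_z` and `opposite_y` evaluate to one: whichever axis is measured on the first particle, the opposite result on the second particle is then certain.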



The standard literature uses the terminology of Einstein, Podolsky and Rosen, which is
more formal. The crucial terms they use are (quotations are taken from [EPR]):

• EOPR: "If, without in any way disturbing a system, we can predict with certainty (i.e., 
with probability equal to unity) the value of a physical quantity, then there exists an 
element of physical reality [EOPR] corresponding to this physical quantity." 

• Comp: A necessary condition for the completeness of a physical theory is that "every 
element of the physical reality must have a counterpart in the physical theory." 

• Loc: The performance of a measurement on a physical system does not have an
instantaneous influence on elements of the physical reality that are located at some distance
from this system.




In these terms the example above now reads as follows. Since without in any way disturbing
the second particle (because of Loc) either its spin along the y-axis or along the z-axis can be 
predicted, both these observables must correspond to elements of the physical reality (EOPR). 
Since no state of the system can describe simultaneously the values of both these observables, 
the theory of quantum mechanics is not complete (because of Comp). 

These terms may sound a bit metaphysical, if only because they hinge upon a particular 
definition of physical reality. However, the argument still holds if one takes the criteria of 
completeness not to be about physical reality, but about possible observations, eliminating 
the objections that instrumentalists (or idealists) may have against this argument. One could 
argue that a physical theory should be local in the sense that, if one can make predictions 
about observables of system 1 by performing measurements on some system 2 separated from 
system 1, the theory should be able to make those predictions also without the use of system 2. 
In addition, the theory is complete if, in this situation, it actually does make these predictions. 
This seems at least sensible. 

Bohr, as the great defender of the completeness of quantum mechanics, published his own 
response to this experiment [Boh2]. His main objection is undoubtedly to be sought in the
following passage. 

"From our point of view we now see that the wording of the above-mentioned 
criterion of physical reality proposed by Einstein, Podolsky and Rosen contains an 
ambiguity as regards the meaning of the expression "without in any way disturbing 
the system." Of course there is in a case like that just considered no question of a 
mechanical disturbance of the system under investigation during the last critical 
stage of the measuring procedure. But even at this stage there is essentially the 
question of an influence on the very conditions which define the possible types of 
predictions regarding the future behavior of the system. Since these conditions 
constitute an inherent element of the description of any phenomenon to which the 
term "physical reality" can properly be attached, we see that the argumentation 
of the above-mentioned authors [Einstein, Podolsky and Rosen] does not justify 
their conclusion that the quantum-mechanical description is incomplete." [Boh2]

In terms of Example 2.2 it seems that Bohr finds an ambiguity in the reasoning used to obtain
the conclusion that both the spin along the z-axis and the spin along the y-axis correspond
to elements of physical reality. Apparently, some form of disturbance must be at hand. In
my understanding, there is an ambiguity in the fact that in [EPR] it is demanded that the
theory gives a simultaneous description of a pair of observables that cannot be measured
simultaneously. Bohr declares such observables to be complementary. I do think
that Bohr's reply might tell us that we may consider quantum mechanics a complete theory
in a certain sense (although I think more clarification is needed), but it doesn't really tell us
why we cannot consider it to be incomplete in a different sense. Thus a search for a theory
that is complete in the sense of Einstein, Podolsky and Rosen (i.e., a so-called hidden-variable
theory) seems to me justified, certainly back in 1935, but even today. However, it appears
that attempts to find such a theory that is also local are doomed to fail.

2.3 Impossibility Proofs for Hidden Variables 

What constitutes a hidden-variable theory? Thus far, it has only been argued that quantum
mechanics does not satisfy the criteria because of its alleged incompleteness (that is, according
to Einstein, Podolsky and Rosen). Let's make some seemingly reasonable assumptions on the
structure of a theory that is supposedly complete (or at least, more complete than quantum
mechanics).

As in any contemporary approach to physics, suppose there is a set $\Lambda$ called the state space.
The completeness claim now implies that there exist states $\lambda \in \Lambda$ that, for each observable $\mathcal{A}$,
determine the value $\lambda(\mathcal{A})$ of that observable. Such a state will be called a pure state and it is
supposed that $\Lambda$ only consists of pure states. As a result, for each observable $\mathcal{A}$, a function
$f_{\mathcal{A}} : \Lambda \to V_{\mathcal{A}}$ can be constructed, assigning to each state the value of the observable in that state:

$$f_{\mathcal{A}}(\lambda) := \lambda(\mathcal{A}). \tag{22}$$

Here $V_{\mathcal{A}}$ denotes the set of all possible values that $\mathcal{A}$ may have. It is common belief that one
can take this to be the set of real numbers (or an n-tuple of real numbers, e.g. in the case of
position or momentum).¹⁷



Now, the interpretation of (22) is that if a system in a state $\lambda$ is considered, and the
observable $\mathcal{A}$ is measured, then one will find the value $f_{\mathcal{A}}(\lambda)$ with probability one. This implies
that measurements reveal properties possessed by the system prior to the measurement. In
particular, the outcomes of experiments are pre-determined (unlike in quantum mechanics).
Furthermore, if one assumes that a measurement does not disturb the state of the system, one
automatically gains repeatability of measurements (i.e., successive measurements of the same
observable will yield the same result). There is no need for a discontinuous state change like
the one introduced by the von Neumann postulate.

The statistics of quantum mechanics should be recovered by the hidden-variable model by
introducing appropriate probability measures $\mathbb{P}$ on the set $\Lambda$ (which should be turned into a
measurable space by an appropriate choice of some $\sigma$-algebra). The expectation value of the
observable $\mathcal{A}$ for the ensemble $\mathbb{P}$ would then be given by

$$\mathbb{E}(\mathcal{A}) = \int_\Lambda f_{\mathcal{A}}(\lambda)\,\mathrm{d}\mathbb{P}(\lambda). \tag{23}$$
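To make the structure of such a model concrete, here is a toy sketch in plain Python. It assumes a hypothetical state space with only four states and two observables named "A" and "B"; none of these names or values come from the text. On a finite space the integral in (23) reduces to a weighted sum.

```python
from fractions import Fraction

# Hypothetical finite state space: every hidden state assigns a definite
# value to every observable (here only two observables, "A" and "B").
STATES = {
    "lam1": {"A": 1, "B": 1},
    "lam2": {"A": 1, "B": -1},
    "lam3": {"A": -1, "B": 1},
    "lam4": {"A": -1, "B": -1},
}


def f(observable, state):
    """Equation (22): f_A(lambda) := lambda(A)."""
    return STATES[state][observable]


def expectation(observable, measure):
    """Equation (23) on a finite space: the integral becomes a weighted sum."""
    return sum(weight * f(observable, state) for state, weight in measure.items())


def variance(observable, measure):
    """Spread of the measurement results of `observable` in the ensemble."""
    mean = expectation(observable, measure)
    return sum(weight * (f(observable, state) - mean) ** 2
               for state, weight in measure.items())


# An ensemble in which all four hidden states are equally likely ...
uniform = {name: Fraction(1, 4) for name in STATES}
# ... and an ensemble concentrated on a single hidden state.
point = {"lam2": Fraction(1)}

mean_uniform = expectation("A", uniform)
var_uniform = variance("A", uniform)
var_point = variance("A", point)
```

The uniform ensemble has mean 0 and variance 1 for "A", whereas the point ensemble has variance 0: all statistical spread in such a model comes from ignorance about the hidden state.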



2.3.1 Von Neumann's Theorem 

The impossibility proof of von Neumann as presented in [vN] is quite extensive and complex
(it spans about ten pages, preceded by about fifteen pages of introductory discussion). In fact,
a good understanding of the proof is hard to acquire and even in recent years explanations of
it have been put online [Ros], [Sin], [Dmi].¹⁸

It is not surprising that the original proof appears to be somewhat vague at first sight. It is
concerned with the question whether or not the stochastic behavior of quantum mechanics can
be reproduced by a classical theory. However, von Neumann's book (from 1932) dates from
before the time the mathematical axioms of classical probability were properly introduced by
Kolmogorov [Kol3] in 1933. The clear structure as presented above therefore wasn't available
to von Neumann at that time.¹⁹ In fact, von Neumann doesn't explicitly speak about assigning
definite values to observables at all and doesn't make use of the notion of a probability space
(like $\Lambda$). Instead, the focus is on ensembles of systems and properties of the expectation values
for observables. Von Neumann investigates what kind of properties ensembles do have from
the point of view of quantum mechanics, and should have from the point of view of
hidden-variable theories. A discrepancy between these two leads von Neumann to conclude that no
completion of quantum mechanics in terms of hidden variables is possible.

¹⁷ It is a remarkable accomplishment of modern science that everything is described by numbers; even
phenomena like colors. However, it seems good to point out that we are holding on to a dogma here, and that
one day it may appear that using numbers isn't an appropriate way to describe all phenomena.

¹⁸ This last article actually originates from 1974, but has only been published recently.

¹⁹ Most likely, von Neumann was acquainted with the recent developments made in probability theory, since
he himself was also working on measure theory. Still, even Kolmogorov's work seems sometimes less formal
from a modern perspective.

In terms of the structure described above, one may think of an ensemble as a function
$\mathbb{E} : O \to \mathbb{R}$ on the set $O$ of all observables, defined by equation (23). Although this is a
good concept to keep in mind when von Neumann talks about an expectation-value function,
it should be emphasized that von Neumann actually refers to a broader notion. In fact,
von Neumann almost proves that the expectation-value functions that appear in quantum
mechanics cannot be of the form (23) (this is proven more explicitly by the violation of the
Bell inequalities, see Section 2.3.4).



To accomplish this, von Neumann relies on four axioms for a hidden-variable theory: 

vN1 For each observable $\mathcal{A}$ corresponding to the operator $A$, and for each polynomial $f : \mathbb{R} \to \mathbb{R}$,
the observable $f(\mathcal{A})$ (corresponding to applying the function $f$ to each measurement
result of $\mathcal{A}$) corresponds to the operator $f(A)$.

vN2 If $\mathcal{A}$ is an observable that only takes positive values, then for each ensemble of systems
one has $\mathbb{E}(\mathcal{A}) \geq 0$.

vN3 For each sequence of observables $\mathcal{A}_1, \mathcal{A}_2, \ldots$ corresponding to operators $A_1, A_2, \ldots$, there
is an observable $\mathcal{A}_1 + \mathcal{A}_2 + \ldots$ corresponding to the operator $A_1 + A_2 + \ldots$.

vN4 For each sequence of observables $\mathcal{A}_1, \mathcal{A}_2, \ldots$, each sequence of real numbers $c_1, c_2, \ldots$
and each ensemble of systems it should hold that

$$\mathbb{E}(c_1\mathcal{A}_1 + c_2\mathcal{A}_2 + \ldots) = c_1\mathbb{E}(\mathcal{A}_1) + c_2\mathbb{E}(\mathcal{A}_2) + \ldots. \tag{24}$$



Axiom vN3 is a bit ambiguous. At first sight, it is not clear if von Neumann allows the sums to
be infinite. It turns out in the proof that this assumption is indeed necessary. In that case, the
following difficulty arises. In general, a sequence of operators $\sum_{i=1}^n A_i$ will not converge to any
operator (neither uniformly, nor strongly, nor weakly). In fact, it is not even clear that if $A_1$
and $A_2$ are self-adjoint, their sum is too (since it is not clear how to choose the domain $D(A_1 + A_2)$).
For the sake of simplicity, one may consider only observables whose corresponding operators are
bounded. Also, von Neumann nowhere uses vN3 in this form in his proof. Instead, one may
introduce the following modified axiom.

vN3' If $\mathcal{A}$ is an observable corresponding to the bounded operator $A$ and $A_1, A_2, \ldots$ is a
sequence of bounded self-adjoint operators such that $\sum_{i=1}^n A_i$ converges strongly to $A$
(as $n \to \infty$), then each of the operators $A_i$ corresponds to a certain observable $\mathcal{A}_i$.

For the same reasons, vN4 will also be modified:

vN4' If $\mathcal{A}$ is an observable corresponding to the bounded operator $A$, and $A_1, A_2, \ldots$ is a
sequence of bounded self-adjoint operators and $c_1, c_2, \ldots$ a sequence of real numbers
such that $\sum_{i=1}^n c_i A_i$ converges strongly to $A$ (as $n \to \infty$), then

$$\mathbb{E}(\mathcal{A}) = \mathbb{E}(c_1\mathcal{A}_1 + c_2\mathcal{A}_2 + \ldots) = c_1\mathbb{E}(\mathcal{A}_1) + c_2\mathbb{E}(\mathcal{A}_2) + \ldots. \tag{25}$$
Note that vN3' and vN4' are in a sense the axioms vN3 and vN4 reversed. Indeed, vN3
postulates the existence of a single observable given the existence of an entire sequence of
observables, whereas vN3' postulates the existence of an entire sequence of observables, given
the existence of a single observable. From the axioms presented in this way, it follows that
each bounded self-adjoint operator should correspond to an observable. Also, it follows from
vN4' that $\mathbb{E}(c\mathcal{A}) = c\,\mathbb{E}(\mathcal{A})$. For this reason, one can always normalize any expectation-value
function $\mathbb{E}$ such that $\mathbb{E}(\mathbb{1}) = 1$ (except for the pathological cases where $\mathbb{E}(\mathcal{A}) = 0$ for all $\mathcal{A}$, or
$\mathbb{E}(\mathbb{1}) = \infty$). Therefore, mainly normalized ensembles will be considered.

Besides these axioms, von Neumann introduces two definitions.

Definition 2.1. An expectation-value function $\mathbb{E} : O \to \mathbb{R}$ is called dispersion free if

$$\mathbb{E}(\mathcal{A}^2) = \mathbb{E}(\mathcal{A})^2, \qquad \forall \mathcal{A}. \tag{26}$$

This definition expresses the idea that for every observable A, its variance in a dispersion- 
free state is zero. That is, in such an ensemble a measurement of any observable A will yield 
a particular result almost surely. A hidden-variable state, then, would have to be dispersion 
free. 

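Definition 2.2. An expectation-value function $\mathbb{E} : O \to \mathbb{R}$ is called pure or homogeneous if
for all expectation-value functions $\mathbb{E}', \mathbb{E}''$ the condition

$$\mathbb{E}(\mathcal{A}) = \mathbb{E}'(\mathcal{A}) + \mathbb{E}''(\mathcal{A}), \qquad \forall \mathcal{A}, \tag{27}$$

implies that there exist positive constants $c', c''$ (independent of $\mathcal{A}$, with $c' + c'' = 1$), such
that

$$\mathbb{E}'(\mathcal{A}) = c'\,\mathbb{E}(\mathcal{A}) \quad \text{and} \quad \mathbb{E}''(\mathcal{A}) = c''\,\mathbb{E}(\mathcal{A}), \qquad \forall \mathcal{A}. \tag{28}$$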

This definition expresses that a homogeneous ensemble is not the mixture of two other 
ensembles. That is, every split made in the ensemble only gives two versions of the original 
ensemble. 

The main mathematical result by von Neumann may now be formulated as follows: 

Theorem 2.2. If vN3' holds, then for every expectation-value function $\mathbb{E}$ that satisfies vN2,
vN4' and $\mathbb{E}(\mathbb{1}) < \infty$, there exists a positive trace-class operator $U$ such that

$$\mathbb{E}(\mathcal{A}) = \mathrm{Tr}(UA), \qquad \forall \mathcal{A} \in O. \tag{29}$$

Conversely, if $U$ is a positive trace-class operator, then the expectation-value function
$\mathbb{E}(\mathcal{A}) = \mathrm{Tr}(UA)$ satisfies vN2 and vN4'.

Proof: 

For a unit vector $e$, let $P_e$ denote the projection on the ray spanned by $e$, and let $\mathcal{P}_e$ denote
the corresponding observable (which exists according to vN3'). For any pair of unit vectors
$e_1, e_2$, the operators $F_{e_1,e_2}$ and $G_{e_1,e_2}$ are defined to be

$$F_{e_1,e_2}\psi := \langle e_2, \psi\rangle e_1 + \langle e_1, \psi\rangle e_2, \qquad G_{e_1,e_2}\psi := i\langle e_2, \psi\rangle e_1 - i\langle e_1, \psi\rangle e_2 = F_{ie_1,e_2}\psi, \tag{30}$$

or, equivalently,

$$F_{e_1,e_2} = P_{(e_1+e_2)/\sqrt{2}} - P_{(e_1-e_2)/\sqrt{2}}, \qquad G_{e_1,e_2} = P_{(ie_1+e_2)/\sqrt{2}} - P_{(ie_1-e_2)/\sqrt{2}}. \tag{31}$$

One easily checks that, if $e_1$ and $e_2$ are either orthogonal or identical (i.e., if $P_{e_1}$ and $P_{e_2}$
commute), then these operators are self-adjoint. The corresponding observables are denoted
by $\mathcal{F}_{e_1,e_2}$ and $\mathcal{G}_{e_1,e_2}$. Note that one has $F_{e,e} = 2P_e$ and $G_{e,e} = 0$ for all unit vectors $e$.

Let $\mathbb{E} : O \to \mathbb{R}$ be given. The operator $U$ can now be defined in the following way. Let $e$
be an arbitrary unit vector and let $(e_n)_{n=1}^\infty$ be an orthonormal basis of the Hilbert space such
that there is an $i$ with $e = e_i$ (note that Hilbert spaces are by definition separable in [vN]).
Now consider the functional

$$f_e : \psi \mapsto \sum_{n=1}^{\infty} \langle\psi, e_n\rangle \left( \tfrac12\,\mathbb{E}(\mathcal{F}_{e_n,e}) + \tfrac{i}{2}\,\mathbb{E}(\mathcal{G}_{e_n,e}) \right). \tag{32}$$



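It must be checked that this limit indeed exists for each $\psi$. Note that projection operators
correspond to positive observables. From vN2 and vN4' it then follows that

$$\begin{aligned}
\mathbb{E}(\mathcal{F}_{e_1,e_2}) &= \mathbb{E}(\mathcal{P}_{(e_1+e_2)/\sqrt{2}}) - \mathbb{E}(\mathcal{P}_{(e_1-e_2)/\sqrt{2}}) \\
&\leq \mathbb{E}(\mathcal{P}_{(e_1+e_2)/\sqrt{2}}) = \mathbb{E}(\mathbb{1}) - \mathbb{E}(\mathbb{1} - \mathcal{P}_{(e_1+e_2)/\sqrt{2}}) \\
&\leq \mathbb{E}(\mathbb{1}) < \infty,
\end{aligned} \tag{33}$$

and similarly, $\mathbb{E}(\mathcal{G}_{e_1,e_2}) \leq \mathbb{E}(\mathbb{1}) < \infty$ for all unit vectors $e_1, e_2$. Therefore,

$$\lim_{N\to\infty} \sum_{n=1}^{N} (\cdots) \;\leq\; \mathbb{E}(\mathbb{1}) \lim_{N\to\infty} \sum_{n=1}^{N} (\cdots) \;\leq\; (\cdots)\,\mathbb{E}(\mathbb{1}) < \infty \tag{34}$$

[the summands of this estimate are illegible in the source].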



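It is straightforward, though tedious, to show that the value of $f_e(\psi)$ does not depend on the
choice of the basis in which $e$ appears. I will omit this part of the proof here. From (34) it
follows that the functional $f_e$ is in fact bounded and hence, according to Riesz' representation
theorem, there is a unique vector in $\mathcal{H}$, which will be denoted by $Ue$, such that

$$f_e(\psi) = \langle\psi, Ue\rangle, \qquad \forall \psi \in \mathcal{H}. \tag{35}$$

This defines the operator $U$. From this definition it follows that

$$\langle e_1, Ue_2\rangle = \tfrac12\,\mathbb{E}(\mathcal{F}_{e_1,e_2}) + \tfrac{i}{2}\,\mathbb{E}(\mathcal{G}_{e_1,e_2}), \qquad \text{whenever } P_{e_1} \text{ and } P_{e_2} \text{ commute.} \tag{36}$$

In particular, one has

$$\langle e, Ue\rangle = \tfrac12\,\mathbb{E}(\mathcal{F}_{e,e}) + \tfrac{i}{2}\,\mathbb{E}(\mathcal{G}_{e,e}) = \mathbb{E}(\mathcal{P}_e). \tag{37}$$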
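The role of the operators $F$ and $G$ is to recover the matrix elements of $U$ from expectation values alone. The sketch below checks this in plain Python for the special case in which the expectation values are already of the form $\mathrm{Tr}(UA)$; the 2x2 matrix `U` is a hypothetical example and the helper names are illustrative.

```python
def outer(u, v):
    """Matrix of the rank-one operator psi -> <v, psi> u."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]


def add(a, b, scale=1):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def trace_product(u, a):
    """Tr(U A) for square matrices of equal size."""
    n = len(u)
    return sum(u[i][k] * a[k][i] for i in range(n) for k in range(n))


# Hypothetical positive operator with trace one (a density matrix).
U = [[0.7, 0.1 - 0.2j],
     [0.1 + 0.2j, 0.3]]

e1, e2 = [1, 0], [0, 1]

# F psi = <e2, psi> e1 + <e1, psi> e2
F = add(outer(e1, e2), outer(e2, e1))
# G psi = i <e2, psi> e1 - i <e1, psi> e2
G = add(outer([1j * x for x in e1], e2), outer([1j * x for x in e2], e1), scale=-1)

# (1/2) E(F) + (i/2) E(G) with E(A) = Tr(U A) should reproduce <e1, U e2>.
matrix_element = 0.5 * trace_product(U, F) + 0.5j * trace_product(U, G)
```

Here `matrix_element` coincides with the entry `U[0][1]`, that is, with $\langle e_1, Ue_2\rangle$. The expectation of $F$ supplies twice the real part and that of $G$ twice the imaginary part.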



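It is now easy to show that $U$ is self-adjoint. For each pair of unit vectors $e_1, e_2$ with $[P_{e_1}, P_{e_2}] = 0$
one has

$$\begin{aligned}
\langle Ue_1, e_2\rangle = \overline{\langle e_2, Ue_1\rangle}
&= \tfrac12\,\mathbb{E}(\mathcal{F}_{e_2,e_1}) - \tfrac{i}{2}\,\mathbb{E}(\mathcal{G}_{e_2,e_1}) \\
&= \tfrac12\,\mathbb{E}(\mathcal{F}_{e_1,e_2}) + \tfrac{i}{2}\,\mathbb{E}(\mathcal{G}_{e_1,e_2}) = \langle e_1, Ue_2\rangle.
\end{aligned} \tag{38}$$

The general result

$$\langle\psi, U\phi\rangle = \langle U\psi, \phi\rangle, \qquad \forall \psi, \phi \in \mathcal{H}, \tag{39}$$

then follows by expanding both vectors with respect to the same basis.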

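Now let $\mathcal{A}$ be an arbitrary observable and let $A$ be its corresponding bounded self-adjoint
operator. For any orthonormal basis $(e_n)_{n=1}^\infty$ of $\mathcal{H}$, write $a_{nm} := \langle e_n, Ae_m\rangle$. Then

$$A = \sum_{n=1}^{\infty} \Big( a_{nn}P_{e_n} + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})F_{e_n,e_m} + \mathrm{Im}(a_{nm})G_{e_n,e_m} \big) \Big), \tag{40}$$

where the right-hand side converges strongly to $A$. Indeed, for $\psi \in \mathcal{H}$ one has

$$\begin{aligned}
&\lim_{N\to\infty} \sum_{n=1}^{N} \Big( a_{nn}P_{e_n} + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})F_{e_n,e_m} + \mathrm{Im}(a_{nm})G_{e_n,e_m} \big) \Big)\psi \\
&\quad= \lim_{N\to\infty} \sum_{n=1}^{N} \sum_{j=1}^{\infty} \Big( a_{nn}P_{e_n} + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})F_{e_n,e_m} + \mathrm{Im}(a_{nm})G_{e_n,e_m} \big) \Big) \langle e_j, \psi\rangle e_j \\
&\quad= \lim_{N\to\infty} \sum_{n=1}^{N} \Big( \langle e_n, \psi\rangle a_{nn}e_n + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})\left(\langle e_m, \psi\rangle e_n + \langle e_n, \psi\rangle e_m\right) \\
&\qquad\qquad + i\,\mathrm{Im}(a_{nm})\left(\langle e_m, \psi\rangle e_n - \langle e_n, \psi\rangle e_m\right) \big) \Big) \\
&\quad= \lim_{N\to\infty} \sum_{n=1}^{N} \Big( \langle e_n, Ae_n\rangle\langle e_n, \psi\rangle e_n + \sum_{m=1}^{n-1} \big( \langle e_n, Ae_m\rangle\langle e_m, \psi\rangle e_n + \langle e_m, Ae_n\rangle\langle e_n, \psi\rangle e_m \big) \Big) \\
&\quad= \lim_{N\to\infty} \sum_{n,m=1}^{N} \langle e_n, Ae_m\rangle\langle e_m, \psi\rangle e_n = A\psi.
\end{aligned} \tag{41}$$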
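In finite dimensions the decomposition of a self-adjoint operator into projections and the operators $F$ and $G$ can be verified directly. The following plain-Python sketch does so for a hypothetical 3x3 Hermitian matrix in the standard basis; it does not address the strong convergence needed in the infinite-dimensional case, and the helper names are illustrative.

```python
def outer(u, v):
    """Matrix of the rank-one operator psi -> <v, psi> u."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]


def add(a, b, scale=1):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def basis(n, dim):
    """n-th standard basis vector of C^dim."""
    return [1 if k == n else 0 for k in range(dim)]


def F(e1, e2):
    # F psi = <e2, psi> e1 + <e1, psi> e2
    return add(outer(e1, e2), outer(e2, e1))


def G(e1, e2):
    # G psi = i <e2, psi> e1 - i <e1, psi> e2
    return add(outer([1j * x for x in e1], e2),
               outer([1j * x for x in e2], e1), scale=-1)


def reconstruct(a):
    """Rebuild a Hermitian matrix from diagonal projections and F, G terms."""
    dim = len(a)
    total = [[0] * dim for _ in range(dim)]
    for n in range(dim):
        e_n = basis(n, dim)
        # Diagonal part: a_nn P_{e_n}
        total = add(total, outer(e_n, e_n), scale=a[n][n])
        for m in range(n):
            e_m = basis(m, dim)
            # Off-diagonal part: Re(a_nm) F + Im(a_nm) G
            total = add(total, F(e_n, e_m), scale=a[n][m].real)
            total = add(total, G(e_n, e_m), scale=a[n][m].imag)
    return total


# Hypothetical Hermitian matrix used only for this check.
A = [[2, 1 - 2j, 0.5j],
     [1 + 2j, -1, 3],
     [-0.5j, 3, 0.5]]

rebuilt = reconstruct(A)
```

Each off-diagonal pair of entries is produced by one $F$ term (real part) and one $G$ term (imaginary part), so `rebuilt` agrees with `A` entry by entry.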



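Finally, using vN4', it follows that

$$\begin{aligned}
\mathbb{E}(\mathcal{A}) &= \sum_{n=1}^{\infty} \Big( a_{nn}\,\mathbb{E}(\mathcal{P}_{e_n}) + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})\,\mathbb{E}(\mathcal{F}_{e_n,e_m}) + \mathrm{Im}(a_{nm})\,\mathbb{E}(\mathcal{G}_{e_n,e_m}) \big) \Big) \\
&= \sum_{n=1}^{\infty} \Big( a_{nn}\,\mathrm{Tr}(UP_{e_n}) + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})\,\mathrm{Tr}(UF_{e_n,e_m}) + \mathrm{Im}(a_{nm})\,\mathrm{Tr}(UG_{e_n,e_m}) \big) \Big) \\
&= \sum_{n=1}^{\infty} \mathrm{Tr}\Big( U\Big( a_{nn}P_{e_n} + \sum_{m=1}^{n-1} \big( \mathrm{Re}(a_{nm})F_{e_n,e_m} + \mathrm{Im}(a_{nm})G_{e_n,e_m} \big) \Big) \Big) \\
&= \mathrm{Tr}(UA),
\end{aligned} \tag{42}$$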

where the second step almost immediately follows from the definition of $U$. The positivity of
$U$ follows from equation (37) together with vN2. From the same equation together with vN4'
it also follows that $U$ is trace-class. Indeed, for any orthonormal basis $(e_i)_{i=1}^\infty$ one finds

$$\sum_{i=1}^{\infty} \langle e_i, Ue_i\rangle = \sum_{i=1}^{\infty} \mathbb{E}(\mathcal{P}_{e_i}) = \mathbb{E}(\mathbb{1}) < \infty. \tag{43}$$

The proof of the converse statement is straightforward and is omitted here. □



From this theorem, the following corollaries are obtained. 
Corollary 2.3. If in addition to the conditions of Theorem 2.2 the ensemble $\mathbb{E}$ is pure and
$U \neq 0$, there is a unit vector $e$ and a real number $\lambda$ such that $U = \lambda P_e$. Conversely, for each
unit vector $e$, the expectation-value function $\mathbb{E}(\mathcal{A}) = \mathrm{Tr}(P_e A) = \langle e, Ae\rangle$ is pure.

Proof: 

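To prove the first statement, let $\psi_0 \in \mathcal{H}$ be such that $U\psi_0 \neq 0$. Then define the operators

$$U' : \psi \mapsto \frac{\langle U\psi_0, \psi\rangle}{\langle\psi_0, U\psi_0\rangle}\,U\psi_0, \qquad U'' : \psi \mapsto U\psi - U'\psi. \tag{44}$$

It follows from the self-adjointness of $U$ that these are both self-adjoint too. Moreover, they
are positive:²⁰

$$\langle\psi, U'\psi\rangle = \frac{|\langle U\psi_0, \psi\rangle|^2}{\langle\psi_0, U\psi_0\rangle} \geq 0, \qquad \langle\psi, U''\psi\rangle = \langle\psi, U\psi\rangle - \frac{|\langle\psi_0, U\psi\rangle|^2}{\langle\psi_0, U\psi_0\rangle} \geq 0. \tag{45}$$

Therefore, the expectation-value functions $\mathbb{E}'$ and $\mathbb{E}''$ associated with $U'$ and $U''$ satisfy
$\mathbb{E}(\mathcal{A}) = \mathbb{E}'(\mathcal{A}) + \mathbb{E}''(\mathcal{A})$. Then, because $\mathbb{E}$ is pure, it follows that there are $c', c''$ such that $U' = c'U$,
$U'' = c''U$ and $c', c'' \geq 0$. Because $U''\psi_0 = 0$, it follows that $c' = 1$ and $c'' = 0$.
Now set $e := U\psi_0 / \|U\psi_0\|$. For every $\psi \in \mathcal{H}$ it then holds that

$$U\psi = U'\psi = \frac{\langle U\psi_0, \psi\rangle}{\langle\psi_0, U\psi_0\rangle}\,U\psi_0 = \frac{\langle e, \psi\rangle\,\|U\psi_0\|^2}{\langle\psi_0, U\psi_0\rangle}\,e = \frac{\|U\psi_0\|^2}{\langle\psi_0, U\psi_0\rangle}\,P_e\psi. \tag{46}$$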

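For the converse, let $U = P_e$ for some unit vector $e$ and let $\mathbb{E}$ denote the expectation-value
function associated with $U$. Suppose $\mathbb{E}'$ and $\mathbb{E}''$ satisfy (27), and let $U'$ and $U''$ be the
positive semi-definite self-adjoint operators associated with these ensembles. It follows that
$U = U' + U''$. Now let $\psi \in \mathcal{H}$ be any vector and set $\psi^{\parallel} = P_e\psi$ and $\psi^{\perp} = (\mathbb{1} - P_e)\psi$. Then

$$0 \leq \langle\psi^{\perp}, U'\psi^{\perp}\rangle \leq \langle\psi^{\perp}, U'\psi^{\perp}\rangle + \langle\psi^{\perp}, U''\psi^{\perp}\rangle = \langle\psi^{\perp}, U\psi^{\perp}\rangle = 0. \tag{47}$$

Thus, $U'\psi^{\perp} = 0$ and also $U''\psi^{\perp} = 0$. Furthermore,

$$\langle(\mathbb{1} - P_e)U'\psi, (\mathbb{1} - P_e)U'\psi\rangle = \langle(\mathbb{1} - P_e)U'\psi, U'\psi\rangle = \langle U'(\mathbb{1} - P_e)U'\psi, \psi\rangle = \langle 0, \psi\rangle = 0. \tag{48}$$

That is, $U'\psi \in P_e\mathcal{H}$ for all $\psi \in \mathcal{H}$ (and similarly for $U''$). Now set $c' = \langle e, U'e\rangle$. Then $U'e = c'e$
and

$$U'\psi = U'\psi^{\parallel} = \langle e, \psi\rangle c'e = c'U\psi, \qquad \forall \psi \in \mathcal{H}. \tag{49}$$

In the same way set $c'' = \langle e, U''e\rangle$, and it follows that $c' + c'' = 1$. This shows that $U$ is pure. □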

In other words, this corollary states that the only possible pure states, as defined in
Definition 2.2, are in fact the ones already given by quantum mechanics. The other corollary
is the following.

Corollary 2.4. If the axioms vN1 and vN3' are satisfied, there are no normalisable
dispersion-free expectation-value functions that satisfy vN2 and vN4'.

²⁰ These inequalities follow from using the Cauchy-Schwarz inequality for the mapping $(\psi, \phi) \mapsto \langle\psi, U\phi\rangle$,
which is an inner product because $U$ is positive. This also shows that $\langle\psi_0, U\psi_0\rangle > 0$.
Proof:

Let $\mathbb{E}$ be an expectation-value function that satisfies vN2 and vN4' and let $U$ be the
self-adjoint operator defined by this function according to the previous theorem. Let $e$ be any unit
vector. Because $\mathbb{E}$ is dispersion free, and because of vN1 one has

$$\mathrm{Tr}(UP_e)^2 = \mathbb{E}(\mathcal{P}_e)^2 = \mathbb{E}(\mathcal{P}_e^2) = \mathrm{Tr}(UP_e^2) = \mathrm{Tr}(UP_e). \tag{50}$$

Since $\mathbb{E}(\mathcal{P}_e) \leq \mathbb{E}(\mathbb{1}) < \infty$, it follows that $\mathrm{Tr}(UP_e) \in \{0, 1\}$ for all $e$. Then, since $e \mapsto \mathrm{Tr}(UP_e)$
is a continuous function on the unit vectors, it must be constant. Consequently, either $U = O$
or $U = \mathbb{1}$ (where $O$ denotes the zero operator and $\mathbb{1}$ the unit operator). But if $U = O$ one
has $\mathbb{E}(\mathcal{A}) = \mathrm{Tr}(O \cdot A) = 0$ for every observable $\mathcal{A}$ and the function is not normalisable.²¹ On
the other hand, if $U = \mathbb{1}$, for each pair of orthonormal vectors $e_1, e_2$ the operator $P_{e_1} + P_{e_2}$ is
again a projection, and

$$\begin{aligned}
2 = \mathrm{Tr}(U(P_{e_1} + P_{e_2})) &= \mathrm{Tr}(U(P_{e_1} + P_{e_2})^2) \\
&= \mathbb{E}((\mathcal{P}_{e_1} + \mathcal{P}_{e_2})^2) = \mathbb{E}(\mathcal{P}_{e_1} + \mathcal{P}_{e_2})^2 = \mathrm{Tr}(U(P_{e_1} + P_{e_2}))^2 = 4,
\end{aligned} \tag{51}$$

which is again a contradiction. Therefore, there are no normalisable dispersion-free ensembles.
□
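The continuity argument in this proof can be illustrated with the simplest quantum example. The sketch below, in plain Python, assumes $U = P_\psi$ with $\psi = (1, 0)$ and rotates a real unit vector $e$ from $\psi$ to a vector orthogonal to it; the function names and the grid of angles are illustrative.

```python
import math


def born_probability(angle):
    """Tr(U P_e) = |<e, psi>|^2 for U = P_psi, psi = (1, 0), e = (cos(angle), sin(angle))."""
    return math.cos(angle) ** 2


def dispersion(p):
    """E(P^2) - E(P)^2 for a projection observable with E(P) = p, using P^2 = P."""
    return p - p ** 2


# Rotate e from psi (angle 0) to a vector orthogonal to psi (angle pi/2).
angles = [k * math.pi / 20 for k in range(11)]
probabilities = [born_probability(a) for a in angles]
dispersions = [dispersion(p) for p in probabilities]
```

The values in `probabilities` run continuously from 1 down to 0, so they cannot be confined to the set {0, 1}, and the dispersion is strictly positive at every intermediate angle, with its maximum of one quarter at 45 degrees.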



The conclusion drawn by von Neumann is the following. Since there are no states that 
are dispersion free, there are no hidden-variable states that can counter the indeterminism 
of quantum mechanics. In fact, since all pure states are given by unit vectors in the Hilbert 
space, no extension of quantum mechanics is possible. In his own words: 

"There would still be the question [. . . ] as to whether the dispersions of the ho- 
mogeneous ensembles characterized by the wave functions [. . . ] are not due to the 
fact that these are not the real states, but only mixtures of several states [. . . ] 
which together would determine everything causally, i.e., lead to dispersion free 
ensembles. The statistics of the homogeneous ensemble [the ones given by the unit 
vectors] would then have resulted from the averaging over that region of values of 
the "hidden parameters" which is involved in those states. But this is impossible 
for two reasons: First, because then the homogeneous ensemble in question could 
be represented as a mixture of two different ensembles, contrary to its definition. 
Second, because the dispersion free ensembles, which would have to correspond to 
the "actual" states [...], do not exist. It should be noted that we need not go any 
further into the mechanism of the "hidden parameters," since we now know that 
the established results of quantum mechanics can never be re-derived with their 
help." [vN] 

So, if I understand von Neumann correctly, any search for a hidden-variable model will be 
doomed to fail. However, the general rule that the more complex an argument becomes, the 
more likely it will be that it is flawed, turns out to apply once again. 

²¹ This is also physically unacceptable. Using the words of von Neumann: "U = O furnishes no information."

2.3.2 A Counterexample

Despite von Neumann's proof it turns out to be easy to show that hidden-variable theories
that reproduce the quantum-mechanical statistics do exist. The simplest way of accomplishing
this will be called the catalog theory (for reasons made clear later on). Let $O$ denote the set of
all self-adjoint operators on the Hilbert space $\mathcal{H}$ associated with the system in question. Since
quantum mechanics has proven to be very accurate in describing phenomena, and because a
theory is desired that completes quantum mechanics and doesn't replace it, take $V_{\mathcal{A}} = \sigma(A)$
for the set of possible values of $\mathcal{A}$, where $A$ is the self-adjoint operator associated with the
observable $\mathcal{A}$ in quantum mechanics. Now the set of hidden variables is defined to be

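$$\Lambda_{\mathrm{cat}} := \{\lambda : O \to \mathbb{R} \;;\; \lambda(\mathcal{A}) \in \sigma(A)\ \ \forall \mathcal{A} \in O\}. \tag{52}$$

A $\sigma$-algebra on this set can be formed in the following way. For each Borel subset $\Delta \subset \mathbb{R}$ and
observable $\mathcal{A}$, let $[\mathcal{A} \in \Delta]$ denote the set of all states for which the value of $\mathcal{A}$ lies in $\Delta$:

$$[\mathcal{A} \in \Delta] = \{\lambda \in \Lambda_{\mathrm{cat}} \;;\; \lambda(\mathcal{A}) \in \Delta\}. \tag{53}$$

Then $\Sigma_{\mathrm{cat}}$ will be the $\sigma$-algebra generated by all these sets. For each quantum-mechanical
state $\psi$, define the probability measure²² $\mathbb{P}_\psi$ to be

$$\mathbb{P}_\psi[\mathcal{A} \in \Delta] = \mathrm{Tr}(P_\psi\,\mu_A(\Delta)), \tag{54}$$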



which ensures that (23) holds. It follows from Kolmogorov's extension theorem that this defines
a probability measure on $(\Lambda_{\mathrm{cat}}, \Sigma_{\mathrm{cat}})$ (I omit the details to keep the argument clear). It follows
from von Neumann's proof that this probability measure is not dispersion free. However, it
is a mixture of ensembles that are dispersion free (each of the $\lambda$ may be associated with a
dispersion-free ensemble). And although it was shown that this ensemble is homogeneous, it
still is a mixture of ensembles. So, accepting the validity of von Neumann's proof, there must
be a conflict with his assumptions.



In the proof of the second part of Corollary 2.3 it was stated that if the ensemble given by
$\mathbb{P}_\psi$ can be written as the mixture of two ensembles, these ensembles can again be associated
with a positive trace-class operator. However, this step uses Theorem 2.2, which only holds if
these ensembles satisfy vN2 and vN4'. And although vN2 is satisfied in this model (positive
observables can indeed only have positive values), vN4' is not. In fact, if one considers an
ensemble that picks out $\lambda$ almost surely, the axiom reads

$$\lambda(\mathcal{A}) = \mathbb{E}(\mathcal{A}) = c_1\mathbb{E}(\mathcal{A}_1) + c_2\mathbb{E}(\mathcal{A}_2) + \ldots = c_1\lambda(\mathcal{A}_1) + c_2\lambda(\mathcal{A}_2) + \ldots, \tag{55}$$

whenever $A = c_1A_1 + c_2A_2 + \ldots$.

Obviously, this property doesn't hold for any of the $\lambda \in \Lambda_{cat}$. It is, in a certain sense, the absence of such a property that makes the theory $(\Lambda_{cat}, \Sigma_{cat})$ meaningless. It is indeed just a big catalog of all possible measurement results, and a state is only determined if one measures all observables. If this is done, then one can look in the catalog to find what the state is. There are no laws in this theory. That is, having measured some observables, one cannot use this theory to make a prediction about the measurement results of any other observable that is any more precise than the prediction that could be made had the earlier measurements not been performed. For example, measuring the momentum $p$ of a particle does not help to predict its kinetic energy $p^2/2m$. Thus such a theory is completely meaningless, for there is no causal structure. We want a theory that at least says that a ball will move if we kick it. However, the importance of the catalog theory lies in the fact that it exposes the assumption (55) made by von Neumann as a rather dubious one.
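To make the lawlessness of the catalog concrete, here is a minimal sketch for a toy system. The three-level observable with spectrum {-1, 0, 1} and the names `SPECTRUM_A`, `catalog_states` and `obeys_square_law` are assumptions made purely for illustration; they do not appear in the construction above. The sketch only shows that a catalog state may pick the values of A and A^2 independently, so that knowing one says nothing about the other.

```python
from itertools import product

# Hypothetical toy observable A with spectrum {-1, 0, 1}; A^2 then has spectrum {0, 1}.
SPECTRUM_A = (-1, 0, 1)
SPECTRUM_A_SQUARED = tuple(sorted({a * a for a in SPECTRUM_A}))

def catalog_states():
    """All catalog states, restricted to the two observables A and A^2."""
    return [{"A": a, "A^2": s} for a, s in product(SPECTRUM_A, SPECTRUM_A_SQUARED)]

def obeys_square_law(state):
    """True if the state respects the law value(A^2) = value(A)^2."""
    return state["A^2"] == state["A"] ** 2

states = catalog_states()
lawful = [s for s in states if obeys_square_law(s)]
print(len(states), "catalog states,", len(lawful), "of them respect the square law")
```

Only the states in `lawful` would survive a rule of the kind Kochen and Specker impose later on; the catalog theory keeps all of them.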



No distinction in notation is made between this measure and the probability function in (10).
The realization that the proof of von Neumann doesn't exclude all possible hidden-variable theories has resulted in heavy criticism of the theorem (see for example [Mer3]). I think this is unreasonable. As a proof of the non-existence of hidden-variable theories it may perhaps depend on unnecessarily strong assumptions, but as an investigation of the possible existence of hidden-variable theories it remains valuable.

How strong is, in fact, assumption vN4'? It is a natural assumption that any hidden-variable theory should be empirically equivalent to quantum mechanics. Now the two postulates of quantum mechanics that relate the mathematical structure to empirical statements are the value postulate and the Born postulate. In a hidden-variable theory the first could easily be accounted for by demanding that $V_{\mathcal{A}} = \sigma(A)$ for all observables. The second one is harder to pinpoint. The Born postulate does in fact imply that for any three observables $\mathcal{A}, \mathcal{A}_1, \mathcal{A}_2$ corresponding to self-adjoint operators $A, A_1, A_2$ such that $A = c_1 A_1 + c_2 A_2$ one has

$$\mathbb{E}(\mathcal{A}) = c_1 \mathbb{E}(\mathcal{A}_1) + c_2 \mathbb{E}(\mathcal{A}_2) \tag{56}$$

for every empirically admissible ensemble (i.e., for every quantum-mechanical ensemble). Is it then necessary that this relation should also hold for all sub-ensembles? In other words, is a violation of $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$ in a sub-ensemble susceptible to empirical investigation?

To test the relation $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$ empirically, one simply measures the three observables $\mathcal{A}$, $\mathcal{A}_1$ and $\mathcal{A}_2$ and checks if the relation holds. If the operators $A_1$ and $A_2$ commute,²³ then according to (property (iv) of) Theorem 2.1 all the projection operators they generate by their spectral measure also commute. Together with the von Neumann postulate, this implies that a measurement of one of the observables alters the probability distribution over the possible outcomes of the other observables in such a way that the relation $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$ will be satisfied. The procedure to test the relation $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$ described above is therefore a meaningful one, and it seems a reasonable assumption that equation (56) should hold for all sub-ensembles. However, in the case that $A_1$ and $A_2$ do not commute, a measurement of one observable in general does not alter the probability distribution over the possible outcomes in such a way that $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$ is satisfied. In that case, a measurement of $\mathcal{A}_1$ and $\mathcal{A}_2$ may best be performed by again splitting a set of systems in the same state in two parts and measuring $\mathcal{A}_1$ on one part and $\mathcal{A}_2$ on the second. But this is not expressed by the relation $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$. In fact, Bell [Bel2] pointed out that if one considers the observables associated with the operators $\sigma_x$, $\sigma_z$ and $\sigma_r$ from Example 2.2 with $r = \tfrac{1}{2}\sqrt{2}\,(1, 0, 1)$, one has

$$\sigma_r = \tfrac{1}{2}\sqrt{2}\,(\sigma_x + \sigma_z). \tag{57}$$

However, this relation can never be satisfied empirically if each of the observables is only allowed to take the values -1 or 1 upon measurement (i.e., eigenvalues are not linear for non-commuting operators).²⁴
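Bell's observation can be checked by direct enumeration. The sketch below assumes the axis $r = \tfrac{1}{2}\sqrt{2}(1, 0, 1)$, so that the operator identity reads $\sigma_r = \tfrac{1}{2}\sqrt{2}(\sigma_x + \sigma_z)$; any other pair of orthogonal axes leads to the same conclusion. The helper names are illustrative only.

```python
import math
from itertools import product

C = math.sqrt(2) / 2

# Pauli matrices as nested lists (real entries suffice for sigma_x and sigma_z).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_R = [[C * (SIGMA_X[i][j] + SIGMA_Z[i][j]) for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

# sigma_r squares to the identity and is traceless, so its eigenvalues are +1 and -1.
square = matmul(SIGMA_R, SIGMA_R)
trace = SIGMA_R[0][0] + SIGMA_R[1][1]

# Smallest violation of  r = C * (x + z)  over all triples of possible outcomes.
EIGENVALUES = (-1, 1)
smallest_gap = min(abs(r - C * (x + z)) for r, x, z in product(EIGENVALUES, repeat=3))
print("smallest violation of the linear relation:", smallest_gap)
```

The operator identity holds exactly, yet no triple of measurement outcomes can satisfy it: the right-hand side can only equal $-\sqrt{2}$, $0$ or $\sqrt{2}$, while the left-hand side is $\pm 1$.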
The conclusion is now drawn that (56) may only seem a reasonable demand if the corresponding self-adjoint operators commute. The arguments used to come to this conclusion appear to me to be reasonably involved, and I therefore disagree with Bell and Mermin, who stated that assumption vN4' (and even von Neumann's proof in general) is "silly".²⁵ Furthermore, the question is left open of what would be a reasonable relation between observables for which the corresponding self-adjoint operators do not commute. Certainly, the classical relation between energy, momentum and position,

$$H = \frac{p^2}{2m} + V(x), \tag{58}$$

should have some meaning in a hidden-variable theory, especially since it plays an important role in quantum physics (as an operator equation). Should the assumption that, in a hidden-variable theory, (56) only holds for quantum-mechanical ensembles (i.e. those described by a positive trace-class operator) really be the remnant of this (once so fundamental) equation?

23 In that case they automatically also commute with A.

24 Already in 1935 a related objection was made by Grete Hermann [Her1] which, however, remained ignored for many years. The interested reader may consult [See] and [Her2] for more information.

25 In [Mer3], Mermin uses this term and defends it referring to a quote from Bell in an interview. Ever since, vN4' has become known as von Neumann's "silly assumption".

At this point, it is not at all clear that a weakening of vN4', e.g., by stating that it only has to hold for observables whose corresponding operators commute, enables one to find a possible hidden-variable theory that is satisfactory. The catalog theory presented above is of course unsatisfactory, if only because it is in empirical conflict with quantum mechanics (e.g. most states will violate a relation like $\mathcal{A} = c_1 \mathcal{A}_1 + c_2 \mathcal{A}_2$ even if the corresponding operators commute). The catalog theory may only be saved by introducing an ad hoc state change upon the performance of a measurement to ensure the preservation of physical laws. Bell showed in [Bel2] that a hidden-variable model that is completely consistent with quantum mechanics is in fact possible for the system of a single spin-½ particle; in his model equation (56) is satisfied for all commuting observables in all states. In that case, the dimension of the Hilbert space is 2. The question then arises whether such a model can be extended to cover systems described by Hilbert spaces of arbitrary dimension. The next theorem will show that this cannot be done.

2.3.3 The Kochen-Specker Theorem 

For a hidden-variable theory, it is demanded that there exists a set $\Lambda$ consisting of all pure states. Appealing to the completeness condition of Einstein, Podolsky and Rosen, it is demanded that for each observable $\mathcal{A}$ there must be a map $\Lambda \to V_{\mathcal{A}}$, assigning to each state the value of the observable $\mathcal{A}$ in that state. To obtain empirical equivalence with quantum mechanics, the set of possible values for $\mathcal{A}$ is taken to be $V_{\mathcal{A}} = \sigma(A)$, where $\sigma(A)$ is the spectrum of the self-adjoint operator $A$ associated with the observable $\mathcal{A}$ in quantum mechanics.

Thus far, the set $\Lambda$ resembles the state space of the catalog theory in the sense that it is completely lawless. From the proof by von Neumann it follows that demanding

$$\lambda(\mathcal{A}) = c_1 \lambda(\mathcal{A}_1) + c_2 \lambda(\mathcal{A}_2) \quad \forall \lambda, \text{ whenever } A = c_1 A_1 + c_2 A_2, \tag{59}$$

is too strict, and in fact not even necessary to obtain empirical equivalence when $A_1$ and $A_2$ do not commute.



It seems good to point out again that von Neumann never explicitly required (59) to hold for the hidden-variable states. In fact, von Neumann explicitly appealed only to the statistical form of this law (i.e., requiring it only to hold for the expectation values), as explained earlier. Because it follows from the Born postulate, this is in fact a necessary requirement for the quantum-mechanical ensembles. In my opinion, the only flaw in the reasoning of von Neumann is to be found in the proof of the homogeneity of the quantum-mechanical ensembles described by the unit vectors in the Hilbert space (Corollary 2.3). It was here that the unreasonable assumption was made that if the quantum-mechanical ensemble was a mixture of two ensembles, then these sub-ensembles should also satisfy vN4'.
In contrast with von Neumann, Kochen and Specker do explicitly work with pure states $\lambda$. Of course, linear laws like (59) (for commuting observables) are not the only ones to be satisfied in the hidden-variable model. For any Borel function $f$ and every observable $\mathcal{A}$ one can introduce the observable $f(\mathcal{A})$, which coincides with applying the function $f$ to the measurement result of $\mathcal{A}$. Kochen and Specker made the following assumption in [KS]:

FUNC: If the observable $\mathcal{A}$ is associated with the self-adjoint operator $A$, then the observable $f(\mathcal{A})$ is associated with the self-adjoint operator $f(A)$ (if it exists).

Note that this is a generalization of vN1. In the precise form given here, this assumption is somewhat implicit in [KS]. There, it automatically results since there is no distinction in notation between observables and self-adjoint operators. The definition of the operator $f(A)$ is then given by the Borel functional calculus.²⁶ It follows that $A$ and $f(A)$ commute, and hence their corresponding observables can be measured simultaneously (their measurement results are always related by $f$, no matter in what order they are measured). This motivates the following assumption.

FUNC: For each observable $\mathcal{A}$ and each Borel function $f$, any hidden-variable state $\lambda$ satisfies

$$\lambda(f(\mathcal{A})) = f(\lambda(\mathcal{A})), \tag{60}$$

where $f(\mathcal{A})$ is defined as stated above.

Note that vN2 is a special case of this assumption. From the two FUNC assumptions together it also follows that (59) holds whenever $A_1$ and $A_2$ commute. Now, if for a certain physical system $Obs$ denotes the set of observables,²⁷ the set of hidden variables that satisfy the criteria imposed by Kochen and Specker is given by

$$\Lambda_{KS} := \{\lambda : Obs \to \mathbb{R} \;;\; \lambda(\mathcal{A}) \in \sigma(A),\ \lambda(f(\mathcal{A})) = f(\lambda(\mathcal{A})),\ \forall \text{ Borel functions } f\}. \tag{61}$$

An element of this set is often also called a valuation function. Using this definition, the theorem of Kochen & Specker can now be formulated in the following way:

Theorem 2.5 (Kochen-Specker Theorem). If the Hilbert space associated with the system has dimension greater than 2, then the set $\Lambda_{KS}$ is empty.

The proof is actually fairly long and I'll try to present it here in the form of a story, discussing the technical details along the way. The story focuses on special observables only, namely those corresponding to projection operators $P$ in quantum mechanics, which are regarded as the "yes-no" questions (see Remark 2.1). For such operators, the following lemma can be proven in a very direct way.²⁸



26 See for example [Con2].

27 Each observable may be associated with a self-adjoint operator according to the observable postulate. The set $Obs$ can thus be viewed as a subset of all self-adjoint operators.

28 In their article [KS], Kochen & Specker refer to theorem 6 in [Neu] to state that whenever two operators $A_1, A_2$ commute, there exists an operator $A$ and functions $f_1, f_2$ such that $f_i(A) = A_i$ for $i = 1, 2$. This result also appears in a lot of textbooks. See for example proposition 1.21 in [Tak], which states that every Abelian von Neumann algebra is generated by a single self-adjoint operator. In a lot of literature, one simply refers to this as "a well known mathematical fact" without specifying exactly what this fact is (it took me quite some time to figure out what it was). To clarify the discussion, I have chosen to only prove the part that is necessary for this discussion.
Lemma 2.6. For any n-tuple $P_1, \ldots, P_n$ of mutually orthogonal (and hence commuting) projection operators, there exists a self-adjoint operator $A_n$ and functions $f_{n,1}, \ldots, f_{n,n}$ such that $P_i = f_{n,i}(A_n)$, for $i = 1, \ldots, n$.

Proof:

By induction on $n$. For $n = 1$ the assertion is trivial, just take $A_1 = P_1$. For arbitrary $n$, take

$$A_n = \alpha_1 P_1 + \alpha_2 P_2 + \cdots + \alpha_n P_n, \quad \text{with } \alpha_1 = 1,\ \alpha_k = \tfrac{1}{2}\left(1 - \sqrt{1 - \alpha_{k-1}}\right). \tag{62}$$

Consider the function $h : \mathbb{C} \to \mathbb{C}$, $h : x \mapsto 4(x - x^2)$. One then has

$$\begin{aligned} h(A_n) &= 4(A_n - A_n^2) = 4(\alpha_1 - \alpha_1^2)P_1 + 4(\alpha_2 - \alpha_2^2)P_2 + \ldots + 4(\alpha_n - \alpha_n^2)P_n \\ &= \alpha_1 P_2 + \alpha_2 P_3 + \cdots + \alpha_{n-1} P_n. \end{aligned} \tag{63}$$

Therefore, applying $h$ $n - 1$ times to $A_n$ gives $h^{n-1}(A_n) = P_n$. Now suppose the lemma is true for a certain $n$. For the case $n + 1$, take $f_{n+1,n+1} = h^n$ (applying $h$ $n$ times) and for $k = 1, \ldots, n$ define

$$f_{n+1,k} : \mathbb{C} \to \mathbb{C}, \qquad f_{n+1,k}(x) := f_{n,k}\big(x - \alpha_{n+1} f_{n+1,n+1}(x)\big). \tag{64}$$

This proves the lemma since

$$f_{n+1,k}(A_{n+1}) = f_{n,k}\big(A_{n+1} - \alpha_{n+1} f_{n+1,n+1}(A_{n+1})\big) = f_{n,k}(A_{n+1} - \alpha_{n+1} P_{n+1}) = f_{n,k}(A_n) = P_k. \tag{65}$$

□
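The construction in this proof can be tried out numerically. Since the $P_i$ are mutually orthogonal, a function of $A_n$ is fixed by its values on the eigenvalues $\alpha_1, \ldots, \alpha_n$ and $0$, so it suffices to check that $f_{n,k}(\alpha_i)$ equals 1 for $i = k$ and 0 otherwise. The sketch below assumes the coefficients are chosen as $\alpha_1 = 1$ and $\alpha_k = \tfrac{1}{2}(1 - \sqrt{1 - \alpha_{k-1}})$, which is the root of $h(\alpha_k) = \alpha_{k-1}$ used here; the function names are illustrative.

```python
import math

def h(x):
    """The polynomial h(x) = 4(x - x^2) used in the proof."""
    return 4 * (x - x * x)

def alphas(n):
    """Coefficients with alpha_1 = 1 and h(alpha_k) = alpha_{k-1} (assumed root choice)."""
    values = [1.0]
    for _ in range(n - 1):
        values.append(0.5 * (1 - math.sqrt(1 - values[-1])))
    return values

def iterate_h(times, x):
    """Apply h the given number of times to x."""
    for _ in range(times):
        x = h(x)
    return x

def f(n, k, x):
    """Evaluate f_{n,k} at the number x, following the induction step (1 <= k <= n)."""
    if k == n:
        return iterate_h(n - 1, x)  # f_{n,n} = h^{n-1}; in particular f_{1,1} is the identity
    alpha_n = alphas(n)[-1]
    return f(n - 1, k, x - alpha_n * iterate_h(n - 1, x))

def max_deviation(n):
    """Largest deviation of f_{n,k} from the Kronecker delta on the eigenvalues of A_n."""
    a = alphas(n)
    worst = 0.0
    for k in range(1, n + 1):
        worst = max(worst, abs(f(n, k, 0.0)))  # eigenvalue 0 on the complement
        for i in range(1, n + 1):
            target = 1.0 if i == k else 0.0
            worst = max(worst, abs(f(n, k, a[i - 1]) - target))
    return worst

print("largest deviation for n = 5:", max_deviation(5))
```

The deviation is limited only by floating-point rounding, which is consistent with $f_{n,k}(A_n) = P_k$.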



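Corollary 2.7 (Finite Sum Rule). For any n-tuple $P_1, \ldots, P_n$ of mutually orthogonal projection operators, every $\lambda \in \Lambda_{KS}$ must satisfy

$$\lambda(\mathcal{P}_1 + \ldots + \mathcal{P}_n) = \lambda(\mathcal{P}_1) + \ldots + \lambda(\mathcal{P}_n), \tag{66}$$

where $\mathcal{P}_i$ is the observable associated with the operator $P_i$.

Proof:

Let $P_1, \ldots, P_n$ be an n-tuple of orthogonal projection operators. Take $A_n$ and $f_{n,1}, \ldots, f_{n,n}$ as constructed in Lemma 2.6 and take $g = f_{n,1} + \ldots + f_{n,n}$. Let $\mathcal{A}_n$ denote the observable associated with the operator $A_n$. One then has for every $\lambda \in \Lambda_{KS}$

$$\begin{aligned} \lambda(\mathcal{P}_1 + \ldots + \mathcal{P}_n) &= \lambda(g(\mathcal{A}_n)) = g(\lambda(\mathcal{A}_n)) \\ &= f_{n,1}(\lambda(\mathcal{A}_n)) + \ldots + f_{n,n}(\lambda(\mathcal{A}_n)) = \lambda(f_{n,1}(\mathcal{A}_n)) + \ldots + \lambda(f_{n,n}(\mathcal{A}_n)) \\ &= \lambda(\mathcal{P}_1) + \ldots + \lambda(\mathcal{P}_n). \end{aligned} \tag{67}$$

□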

Here it shows that the apparently innocent FUNC rule has serious consequences. In fact, the finite sum rule enables one to prove the following lemma, which, together with its corollary, already almost proves the Kochen-Specker Theorem. The following lemma is about four-dimensional Hilbert spaces. Later on, I will present a similar result for the three-dimensional case (Lemma 2.12). Lemma 2.8 may be seen as an appetizer for the one presented later.
Lemma 2.8. If $\Lambda_{KS}$ is a set of hidden variables to describe a system associated with a four-dimensional Hilbert space, then $\Lambda_{KS}$ is empty.

Proof:

For any 4-tuple of orthogonal 1-dimensional projection operators $P_1, P_2, P_3, P_4$, the finite sum rule implies that

$$\lambda(\mathcal{P}_1) + \lambda(\mathcal{P}_2) + \lambda(\mathcal{P}_3) + \lambda(\mathcal{P}_4) = \lambda(\mathcal{P}_1 + \mathcal{P}_2 + \mathcal{P}_3 + \mathcal{P}_4) = \lambda(\mathbb{1}) = 1, \quad \forall \lambda \in \Lambda_{KS}, \tag{68}$$

where $\mathbb{1}$ denotes the observable associated with the unit operator 1.

Now let $\lambda$ be any element of $\Lambda_{KS}$. By (68), for any 4-tuple of mutually orthogonal 1-dimensional projection operators $P_1, P_2, P_3, P_4$, the state $\lambda$ must assign the value 1 to exactly one of the corresponding observables, and the value 0 to all the others.



Table 1: The 18 vectors appearing in the proof of Lemma 2.8. Each vector appears exactly two times.

(0,0,0,1)    (0,0,0,1)    (1,-1,1,-1)  (1,-1,1,-1)  (0,0,1,0)    (1,-1,-1,1)  (1,1,-1,1)   (1,1,-1,1)   (1,1,1,-1)
(0,0,1,0)    (0,1,0,0)    (1,-1,-1,1)  (1,1,1,1)    (0,1,0,0)    (1,1,1,1)    (1,1,1,-1)   (-1,1,1,1)   (-1,1,1,1)
(1,1,0,0)    (1,0,1,0)    (1,1,0,0)    (1,0,-1,0)   (1,0,0,1)    (1,0,0,-1)   (1,-1,0,0)   (1,0,1,0)    (1,0,0,1)
(1,-1,0,0)   (1,0,-1,0)   (0,0,1,1)    (0,1,0,-1)   (1,0,0,-1)   (0,1,-1,0)   (0,0,1,1)    (0,1,0,-1)   (0,1,-1,0)



Now consider the 18 vectors in Table 1. This table of vectors was first introduced by Cabello in [CEGA]. Each column of the table constitutes an orthogonal basis of the Hilbert space $\mathbb{C}^4$. With each of the vectors one can associate the observable that corresponds to the projection on the line spanned by that vector. Thus for every column in the table, the state $\lambda$ associates the value 1 to exactly one of the vectors in the column. In total, since there are nine columns, the value 1 would appear nine times. On the other hand, since there are 18 different vectors in total, and since once a vector is associated with the number 1 in one column it must also be associated with the number 1 in the other column in which it appears, the number 1 should appear an even number of times. Since 9 is odd, the state $\lambda$ cannot exist.

In terms more common in this discussion, the state $\lambda$ would lead to an impossible coloring of the vectors in Table 1 (e.g. associating 1 with the color black, and 0 with the color white). □
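The parity argument is short enough to be checked by machine as well. The following sketch lists the nine columns of Table 1 (the column layout is read off from the table row by row; the variable names are illustrative), verifies that every column is an orthogonal basis and that every vector occurs twice, and then searches all $2^{18}$ assignments of 0 and 1 for one that marks exactly one vector per column.

```python
from collections import Counter
from itertools import combinations

# The nine columns of Table 1; each column should be an orthogonal basis of C^4.
COLUMNS = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

columns_orthogonal = all(dot(u, v) == 0 for col in COLUMNS for u, v in combinations(col, 2))
occurrences = Counter(v for col in COLUMNS for v in col)

# Encode an assignment as an 18-bit integer: bit i is the value given to vector i.
vectors = sorted(occurrences)
index = {v: i for i, v in enumerate(vectors)}
column_masks = [sum(1 << index[v] for v in col) for col in COLUMNS]

def is_valid(assignment):
    """True if exactly one vector in every column carries the value 1."""
    return all(bin(assignment & mask).count("1") == 1 for mask in column_masks)

colorings = [a for a in range(1 << len(vectors)) if is_valid(a)]
print(len(vectors), "vectors,", len(colorings), "valid assignments")
```

The exhaustive search is of course redundant given the parity argument, but it also confirms that the table was transcribed correctly.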



Corollary 2.9. The set $\Lambda_{KS}$ is empty if the associated Hilbert space $\mathcal{H}$ has finite dimension $n \geq 4$.

Proof:

Suppose $P_1, \ldots, P_n$ is an n-tuple of 1-dimensional orthogonal projection operators on $\mathcal{H}$ and let $\lambda$ be any state in $\Lambda_{KS}$. By the finite sum rule, the state $\lambda$ must assign the value 1 to precisely one of the observables $\mathcal{P}_1, \ldots, \mathcal{P}_n$. Suppose $\lambda(\mathcal{P}_j) = 1$ and let $\mathcal{H}'$ be any four-dimensional linear subspace of $\mathcal{H}$ containing $P_j\mathcal{H}$ as a subspace (i.e. $P_j\mathcal{H} \subset \mathcal{H}' \subset \mathcal{H}$).

Let $P_{\mathcal{H}'} : \mathcal{H} \to \mathcal{H}$ denote the projection on the subspace $\mathcal{H}'$. Now for any observable $\mathcal{A}'$ associated with the operator $A'$ acting on the space $\mathcal{H}'$ (i.e., $\mathcal{A}'$ is an observable for the system associated with a four-dimensional Hilbert space), let $\mathcal{A}$ denote the observable (for the system associated with the Hilbert space $\mathcal{H}$) associated with the operator $A'P_{\mathcal{H}'}$.

Now the state $\lambda$ generates a state $\lambda'$ in the set $\Lambda'_{KS}$ of hidden variables associated with $\mathcal{H}'$ according to

$$\lambda'(\mathcal{A}') := \lambda(\mathcal{A}), \tag{69}$$

where $\mathcal{A}$ is the observable generated by $\mathcal{A}'$ as described above. It is easy to check that this indeed gives an element of $\Lambda'_{KS}$. In particular, for any 4-tuple of 1-dimensional orthogonal projection operators $P'_1, P'_2, P'_3, P'_4$ (acting on $\mathcal{H}'$) one has

$$\lambda'(\mathcal{P}'_1) + \lambda'(\mathcal{P}'_2) + \lambda'(\mathcal{P}'_3) + \lambda'(\mathcal{P}'_4) = \lambda'(\mathcal{P}'_1 + \mathcal{P}'_2 + \mathcal{P}'_3 + \mathcal{P}'_4) = \lambda(\mathcal{P}_{\mathcal{H}'}) = 1. \tag{70}$$

This is sufficient to show that $\lambda'$ (and therefore $\lambda$) does not exist, using the proof of Lemma 2.8. □



Thus far it has been established that a hidden-variable theory (that has the structure of the set $\Lambda_{KS}$) cannot be used to describe a system whose associated Hilbert space has finite dimension $n \geq 4$. The generalization to separable Hilbert spaces is more technical and in fact, I haven't found any article (about the Kochen-Specker Theorem) that handles the complications involved in detail. This is a bit surprising, since the Kochen-Specker Theorem would lose some of its power if it did not apply to separable Hilbert spaces. One may, for example, argue that every physical system is in fact associated with a separable Hilbert space. Even the simple example of a spin-½ particle, usually described by the Hilbert space $\mathbb{C}^2$, is in fact only correctly described by the Hilbert space $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$ (since one cannot exclude the possibility that the particle always has some freedom to move in space). From this point of view it may seem possible (though unlikely) that the extra structure provided by the infinite-dimensional Hilbert space does allow for the existence of hidden variables.

The problems involved in the infinite-dimensional case become clear when looking at the proof of Corollary 2.9. To construct an appropriate four-dimensional subspace, it is necessary that for each $\lambda$, one can choose a projection operator $P_\lambda$ for which $\lambda(\mathcal{P}_\lambda) = 1$ and $P_\lambda\mathcal{H}$ has dimension at most 4. But although for every projection operator $P$ on the separable space $\mathcal{H}$ one can establish whether $\lambda(\mathcal{P})$ equals 1 or 0, it is not trivial to find a finite-dimensional projection operator $P$ for which $\lambda(\mathcal{P}) = 1$. Fortunately, that such a projection can indeed be found is proved by the following lemma.

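Lemma 2.10 (Infinite Sum Rule). For each sequence of 1-dimensional mutually orthogonal projection operators $P_1, P_2, \ldots$ with $\sum_{n=1}^{\infty} P_n = 1$ (strongly) and for each $\lambda \in \Lambda_{KS}$, there is exactly one $n$ such that $\lambda(\mathcal{P}_n) = 1$.

Proof:

Consider the (bounded) self-adjoint operator $A := \sum_{n=1}^{\infty} \frac{1}{n} P_n$ and define functions $(f_n)_{n=1}^{\infty}$ as follows:

$$f_n(x) := \begin{cases} 1, & x = \frac{1}{n}, \\ 0, & \text{otherwise}. \end{cases} \tag{71}$$

The Borel functional calculus then implies that $f_n(A) = P_n$ for each $n$. It then follows from the FUNC rule that

$$\sum_{n=1}^{\infty} \lambda(\mathcal{P}_n) = \sum_{n=1}^{\infty} \lambda(f_n(\mathcal{A})) = \sum_{n=1}^{\infty} f_n(\lambda(\mathcal{A})) = 1 = \lambda(\mathbb{1}) = \lambda\Big(\sum_{n=1}^{\infty} \mathcal{P}_n\Big), \tag{72}$$

where I used $\sum_{n=1}^{\infty} f_n(\lambda(\mathcal{A})) = 1$, since

$$\lambda(\mathcal{A}) \in \sigma(A) = \{\tfrac{1}{n} \;;\; n \in \mathbb{N} \setminus \{0\}\}. \tag{73}$$

Therefore, there is exactly one $n$ such that $\lambda(\mathcal{P}_n) = 1$. □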
Corollary 2.11. The set $\Lambda_{KS}$ is empty if the associated Hilbert space $\mathcal{H}$ is either separable, or has finite dimension $n \geq 4$.

So the only thing left to prove is that the set $\Lambda_{KS}$ is also empty in case $\dim \mathcal{H} = 3$. As seen in the proof of Lemma 2.8, it is sufficient to show that there exists no map that assigns either the value 0 or 1 to each observable associated with a projection operator, such that for any triplet of mutually orthogonal 1-dimensional projections $P_1, P_2, P_3$ (often called a triad) precisely one is assigned the value 1. That is, only the following lemma has to be proven.

Lemma 2.12. Let $\mathcal{P}_1(\mathbb{C}^3)$ denote the set of all 1-dimensional projection operators on $\mathbb{C}^3$. There exists no map $v : \mathcal{P}_1(\mathbb{C}^3) \to \{0, 1\}$ such that for each triad $P_1, P_2, P_3$, there is precisely one $j \in \{1, 2, 3\}$ with $v(P_j) = 1$.



Table 2: The 33 vectors appearing in the proof of Lemma 2.12. [Table entries not recoverable from the source.]

Proof:

For the proof a specific finite subset of $\mathcal{P}_1(\mathbb{C}^3)$ is considered, and it is shown that no map satisfying the desired properties on this domain can exist. In the original proof [KS], Kochen and Specker used a subset consisting of 117 elements. The proof discussed here is based on a modification due to Peres [Per3]. It starts with 33 projections, taken to be the projections on the lines spanned by the unit vectors in Table 2.

With these vectors, 16 triads in total can be constituted. In each triad, precisely one of the vectors must be assigned the value one. Besides these triads, there are 24 pairs of orthogonal vectors. For a pair of orthogonal vectors $v_1, v_2$ consider the following modification of the sum rule:

$$v(\mathcal{P}_{v_1}) + v(\mathcal{P}_{v_2}) \leq v(\mathcal{P}_{v_1}) + v(\mathcal{P}_{v_2}) + v(\mathcal{P}_{v_1 \times v_2}) = 1, \tag{74}$$

where $\times$ denotes the exterior product. This implies that whenever one of the observables $\mathcal{P}_{v_1}, \mathcal{P}_{v_2}$ is assigned the value 1 by $v$, the other must be assigned the value 0. The 16 triads, together with the 24 pairs of orthogonal vectors,²⁹ are listed in Table 3.

29 In [PMMM] it is argued (among other things) that the 24 extra vectors needed to accomplish (74) are essential to make the proof constructive. This would imply that the finite subset of $\mathcal{P}_1(\mathbb{C}^3)$ for which the map $v$ doesn't exist should actually consist of 57 elements instead of 33. An interesting consequence is that, from this point of view, the leading score in the competition of finding the smallest set for which $v$ does not exist is no longer held by Conway and Kochen [Per4, p. 114] (requiring a set of 31/51 elements) but by Bub [Bub1] (requiring a set of 33/49 elements).
Figure 1: A schematic view of the consequences for assigning the value 1 to the vectors $e_1$ and g\. A black marking depicts the assignment of the value 1 to the vector, whilst an absent marking depicts the assignment of the value 0. A line between two vectors denotes that these two vectors are orthogonal (only the lines used in the proof are drawn).

Table 3: The 16 triads constituted with the vectors of Table 2, together with the remaining 24 orthogonal vector pairs. [Table entries not recoverable from the source.]



Because the set of vectors from Table 2 is invariant under permutations of the x, y and z axes, any of the vectors $e_1, e_2, e_3$ may be chosen to be assigned the value 1 without loss of generality. Let $e_1$ be assigned the value 1 (for any $v$ one may label the axes such that this is the case).

As a consequence, $e_2$ will be assigned the value 0. Therefore, from each of the pairs g\-,g\ and 92 1 92 exactly one must be assigned the value 1. I will first show that assigning the value 1 to g\ leads to a contradiction. The line of reasoning is also depicted in Figure 1.

When g\ is assigned the value 1, it immediately follows that h\ and h\ should be assigned the value 0. Also, the vectors f\ and fi must be assigned the value 0, since they form a triad together with $e_1$. Consequently, from the triads h\, h\, f\ and h\,h\, ff, the vectors h\ and h\ must be assigned the value 1. This results in g\ and g^ being assigned the value 0. As a next step, note that g^ and g| must be assigned the value 1 (because they appear in the triads 63,(73,(7! and 63,(73,(7!). This



implies that h\,h\,h\, h\ are assigned the value 0. One then infers that both f\ and /| must be assigned the value 1. But this is a contradiction, since f 2 l and /| are orthogonal. Thus, g| cannot be assigned the value 1.

Because of the symmetry around the x-axis, the same line of reasoning can be used to show that g\ cannot be assigned the value 1. The only option left is to assign the value 1 to both g\ and g\. This will also lead to a contradiction. Indeed, in this case it follows that h\,h\,h^ and h\ must be assigned the value 0 (see the figure). Consequently, both f% and /| must be assigned the value 1, which is in violation of (74).

This completes the proof. □ 
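The combinatorial content of this proof can also be checked by a small exhaustive search. The sketch below does not use the labelling of Table 2; it assumes instead that the 33 vectors are, up to normalization, Peres's rays with components in $\{0, \pm 1, \pm\sqrt{2}\}$. Numbers of the form $a + b\sqrt{2}$ are stored exactly as integer pairs, so that orthogonality is decided without rounding. All names are illustrative.

```python
from itertools import combinations, product

# Numbers a + b*sqrt(2) are stored exactly as integer pairs (a, b).
ZERO, ONE, ROOT2 = (0, 0), (1, 0), (0, 1)

def times(x, y):
    """Exact product (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def orthogonal(u, v):
    terms = [times(a, b) for a, b in zip(u, v)]
    return sum(t[0] for t in terms) == 0 and sum(t[1] for t in terms) == 0

def peres_rays():
    """Rays with components in {0, +-1, +-sqrt2} (assumed form), one vector per ray."""
    values = [ZERO, ONE, (-1, 0), ROOT2, (0, -1)]
    # Sorted absolute values of the components, coded as 0 -> 0, 1 -> 1, 2 -> sqrt2.
    allowed = {(0, 0, 1), (0, 1, 1), (0, 1, 2), (1, 1, 2)}
    rays = []
    for vec in product(values, repeat=3):
        code = tuple(sorted(abs(a) + 2 * abs(b) for a, b in vec))
        first = next((c for c in vec if c != ZERO), None)
        if code in allowed and first in (ONE, ROOT2):  # fixes the overall sign
            rays.append(vec)
    return rays

RAYS = peres_rays()
N = len(RAYS)
NEIGHBOURS = [frozenset(j for j in range(N) if j != i and orthogonal(RAYS[i], RAYS[j]))
              for i in range(N)]
TRIADS = [t for t in combinations(range(N), 3)
          if all(b in NEIGHBOURS[a] for a, b in combinations(t, 2))]
PAIRS_IN_TRIADS = {frozenset(p) for t in TRIADS for p in combinations(t, 2)}
EXTRA_PAIRS = [(i, j) for i in range(N) for j in NEIGHBOURS[i]
               if i < j and frozenset((i, j)) not in PAIRS_IN_TRIADS]

def has_coloring(chosen=frozenset(), blocked=frozenset()):
    """Search for a set of rays valued 1: one in every triad, no two of them orthogonal."""
    open_triads = [t for t in TRIADS if not chosen.intersection(t)]
    if not open_triads:
        return True
    # Branch on the open triad with the fewest members that may still receive the value 1.
    options = min(([r for r in t if r not in blocked] for t in open_triads), key=len)
    return any(has_coloring(chosen | {r}, blocked | NEIGHBOURS[r]) for r in options)

print(N, "rays,", len(TRIADS), "triads,", len(EXTRA_PAIRS), "extra pairs")
print("coloring exists:", has_coloring())
```

The search enforces exactly the two rules used in the proof: every triad contains precisely one vector with the value 1, and by (74) two orthogonal vectors never both carry the value 1.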




So Lemma 2.12 and Corollary 2.11 together constitute a proof of the Kochen-Specker Theorem. Lemma 2.12 is often seen as the key element of the proof, named a coloring theorem. A set for which a function $v$ does not exist is called not KS-colorable (or simply not colorable).

As with von Neumann's theorem, the Kochen-Specker Theorem has also attracted some criticism (see for example³⁰ [Bel2], [HR], [Mer3]). The arguments used, however, are more involved.

A technical objection may be raised against the FUNC assumption (60). In itself, this seems a very natural assumption to construct a theory that is not lawless, but it relies on the more complex assumption FUNC (the one relating the observable $f(\mathcal{A})$ to the operator $f(A)$) together with some mathematical considerations. In the case of the finite sum rule, it was shown that the technicalities could be dealt with easily. Indeed, for a simple polynomial function $f$, the observable $f(\mathcal{A})$ is quite naturally associated with the operator $f(A)$ since $f(A)$ has a clear explicit definition. But in the case of the infinite sum rule, that assumption appears to me to be more artificial.

Not every text that presents the Kochen-Specker Theorem uses the FUNC rule, so objections related to precisely this rule aren't very strong. For example, in [CEGA] one simply starts with the assumption of the infinite sum rule. I will now present a possible (though not very strong) direct motivation for this rule.

30 The criticism is in fact much more moderate than the criticism von Neumann attracted. Note that the article by Bell dates from before the Kochen-Specker Theorem. In fact, in this article Bell criticizes a similar result proven by him in the same article. The theorem is therefore also often called the Bell-Kochen-Specker Theorem.
Consider an observable $\mathcal{A}$ corresponding to the operator $A$. A measurement of $\mathcal{A}$ can be interpreted as asking the system the question: "What is the value of $\mathcal{A}$?" This formulation is a bit sloppy since the discussion is exactly about whether or not a value can be assigned to the system at all times. One may instead read this question as "What is the value of $\mathcal{A}$ after the measurement?", at least in case one accepts that "the value of $\mathcal{A}$" has a meaning if the system is in an eigenstate of the operator $A$. Otherwise, one may read this question as "What will be the value yielded when I perform a measurement of $\mathcal{A}$?" or, if one rejects the possibility of infinitely precise measurements: "What subset of $\sigma(A)$ will emerge when I perform a measurement that I associate with the measurement of $\mathcal{A}$?" Whatever question the reader may prefer, let's agree to abbreviate it with "What is the value of $\mathcal{A}$?"

Conversely, it seems reasonable that similar, physically meaningful questions could be associated with operators. Suppose $\Delta \subset \sigma(A)$. One may ask the question "Does the value of $\mathcal{A}$ lie in $\Delta$?" In fact, one can associate an operator with this question, namely the projection operator $\mu_A(\Delta)$ from Theorem 2.1. Now let $(\Delta_i)_{i \in I}$ be any partition of $\sigma(A)$. A measurement of $\mathcal{A}$ may then be interpreted as simultaneously asking all the yes-no questions³² "Does the value of $\mathcal{A}$ lie in $\Delta_i$?" for $i \in I$. Since a measurement will yield only one result, precisely one of the questions must be answered yes, and all the others must be answered no. This motivates the (infinite) sum rule.

A possible objection against this line of reasoning is that it is in general not clear whether or not a sequence of mutually orthogonal projection operators $(P_i)_{i \in I}$ can be associated with a single observable $\mathcal{A}$ such that $P_i = \mu_A(\Delta_i)$ for all $i$ for some partition $(\Delta_i)_{i \in I}$ of $\sigma(A)$. This exposes a second problem: why should the self-adjoint operators mentioned in the proof indeed correspond to observables?³³



A short investigation of this problem will reveal yet another objection when considering the proof of Lemma 2.8. In this setting, for each column of Table 1 one would have to find an observable $\mathcal{A}_i$ and a partition $(\Delta_i^k)_{k=1}^{4}$ of $\sigma(A_i)$, $i = 1, \ldots, 9$ (since there are nine columns), such that each vector in the $i$th column is associated with the observable corresponding to $\mu_{A_i}(\Delta_i^k)$ for some $k$. However, from this point of view, it is not at all clear why, whenever $\mu_{A_i}(\Delta_i^k) = \mu_{A_j}(\Delta_j^l) = P$ ($i \neq j$), they should both be associated with the same observable $\mathcal{P}$ and why they should be assigned the same value. One may argue that the observable associated with $\mu_{A_i}(\Delta_i^k)$ is in fact not the same as the one associated with $\mu_{A_j}(\Delta_j^l)$, especially since the observables $\mathcal{A}_i$ and $\mathcal{A}_j$ cannot be measured at the same time. This leads to the following definition.

Definition 2.3. A theory is called contextual if the value of an observable may depend on 
the measuring context. That is, it may depend on what other observables are being measured. 

In terms of the observables $\mathcal{A}_i$, $\mathcal{A}_j$ and $\mathcal{P}$, contextuality states that the value of $\mathcal{P}$ may depend upon whether it is measured simultaneously with $\mathcal{A}_i$ or with $\mathcal{A}_j$. This seems a natural assumption if $\mathcal{P}$ is considered to be the observable corresponding to $\mu_{A_i}(\Delta_i^k)$ in one measuring context, whilst corresponding to $\mu_{A_j}(\Delta_j^l)$ in the other. But what if $\mathcal{P}$ has a meaning all of its own, independently of the observables $\mathcal{A}_i$ and $\mathcal{A}_j$? In that case, contextuality becomes a stronger assumption. It then states that the value of $\mathcal{P}$ may depend on the way we (as experimenters) look at it.

31 A more extensive investigation of possible starting assumptions and how they are related can be found in EH-

32 See also Remark 2.1.

33 In the von Neumann proof, it was simply postulated (though implicitly) that the converse of the observable postulate should hold. One may, of course, postulate the same for the Kochen-Specker Theorem, but that seems unreasonable to me (see earlier considerations when introducing the observable postulate on page 5).

It turns out that if one wishes to use the contextuality argument to criticize the Kochen-Specker
Theorem, one necessarily has to adopt this stronger form. This is because the operators
that play a role in the proof of Lemma 2.12 can each be associated with a single observable
that has a clear physical meaning independently of its measuring context. Indeed, every vector
can be associated with the squared spin of a spin-1 particle along the axis spanned by that vector.
The operators associated with these observables commute if and only if the associated axes are
orthogonal. Thus every triad can be associated with a set of three observables that can be measured
at the same time. Quantum mechanics predicts that such triplet measurements always
yield the value 1 twice and the value 0 once. The proof of Lemma 2.12 thus also shows that
a (non-contextual) definite value assignment to these observables is impossible.
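
The claims about squared spin-1 components can be checked numerically. The following is a minimal sketch, assuming the standard spin-1 matrices in the $S_z$ eigenbasis (in units of $\hbar$); the helper names `squared_spin` and `commutator_size` are illustrative and do not appear in the text. It checks that the squared spins along an orthogonal triad commute and add up to twice the identity (so a joint measurement yields the value 1 twice and 0 once), and that the squared spins along two non-orthogonal axes do not commute.

```python
import math

S = 1 / math.sqrt(2)

# Standard spin-1 matrices in the S_z eigenbasis (an assumption of this sketch).
SX = [[0, S, 0], [S, 0, S], [0, S, 0]]
SY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
SZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def squared_spin(axis):
    """Return (n . S)^2 for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = axis
    spin = [[nx * SX[i][j] + ny * SY[i][j] + nz * SZ[i][j] for j in range(3)]
            for i in range(3)]
    return matmul(spin, spin)


def max_abs_difference(a, b):
    """Largest entrywise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


def commutator_size(a, b):
    """Size of the commutator [a, b]; numerically zero iff a and b commute."""
    return max_abs_difference(matmul(a, b), matmul(b, a))


# An orthogonal triad and one axis that is not orthogonal to the z-axis.
X, Y, Z = (1, 0, 0), (0, 1, 0), (0, 0, 1)
TILTED = (S, 0, S)

sx2, sy2, sz2 = (squared_spin(n) for n in (X, Y, Z))
triad_sum = [[sx2[i][j] + sy2[i][j] + sz2[i][j] for j in range(3)]
             for i in range(3)]
twice_identity = [[2 if i == j else 0 for j in range(3)] for i in range(3)]

print("orthogonal axes commute:", commutator_size(sx2, sy2) < 1e-12)
print("triad sums to 2 * identity:",
      max_abs_difference(triad_sum, twice_identity) < 1e-12)
print("tilted axis fails to commute:",
      commutator_size(sz2, squared_spin(TILTED)) > 1e-12)
```

Since each squared spin has eigenvalues 0 and 1 only, the sum rule $S_x^2 + S_y^2 + S_z^2 = 2$ forces exactly one outcome 0 per triad, which is the constraint used in the proof.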

Bell [Bel2] argued that the value assigned to any observable may well depend on which
other observables are to be measured. This is because different sets of observables must be
measured using different measuring devices. This is, in fact, in line with (and partly inspired
by) the philosophy of Bohr, who thought that observables have no real meaning at all without
specifying the measuring context. But, for me, the idea that this may be so, i.e. that the value
of an observable depends on the way we measure it, is a substantial compromise on what is
to be expected from a hidden-variable theory. It departs from the idea that observables should
have a definite value independent of the observer. Quite a step for someone who was 'against
"measurement"' [Bel5].



2.3.4 The Bell Inequality 



Instead of studying the possibility of hidden-variable theories in a broad sense,34 the argument
of Bell focuses on one specific physical system: that of a pair of spin-$\tfrac{1}{2}$ particles. The original
paper [Bel1] does not in fact account for contextual theories (i.e., it assumes definite values
for observables independent of the measuring context). The argument was improved in [Bel3],
where Bell incorporated the possibility that the actual value obtained when measuring an
observable may also depend on the setting of the measurement device. Later, Bell's argument
was also extended to incorporate stochastic hidden-variable theories (i.e. theories in
which measurement results are not necessarily pre-determined, but do obey certain other
requirements). To make the argument for stochastic theories more comprehensible, I will
first discuss a Bell-type argument for deterministic theories. A more extensive account of the
variety of Bell arguments can be found in [CS] and [See3].



Deterministic Hidden Variables 

I will present the argument here in a way that is slightly different from what is common in most
literature. The differences will become more apparent along the way. The main difference is
that I start from the hidden state $\lambda$ of the system that is supposed to determine all outcomes
of all possible experiments for all possible contexts, whereas one usually argues from the value
of a specific observable as a function of several hidden variables, which may then be 'local' or
'non-local'.

34 The proof by von Neumann focuses on ensembles for arbitrary physical systems and also the Kochen-
Specker Theorem is a statement about all physical systems that are described by a Hilbert space of dimension
greater than 2.

The system under consideration is again that of the two spin-$\tfrac{1}{2}$ particles of Example 2.2.
The question will be whether or not a hidden-variable theory can reproduce the statistics that
quantum mechanics dictates if the system is prepared in the state

$$\psi = \tfrac{1}{\sqrt{2}}(0, 1, -1, 0). \tag{75}$$

First, an examination of the structure of a contextual hidden-variable theory is necessary.
Since it is now allowed that values of observables may depend on measuring contexts, one
should reconsider the notion of a state in a hidden-variable theory. Now a state $\lambda$ should
assign a value in $\sigma(A)$ to each observable $\mathcal{A}$, for each measuring context $\mathcal{C}$ in which it is
possible to measure $\mathcal{A}$. To give a precise meaning to this requirement, the following definition
is helpful.

Definition 2.4. A measuring context $\mathcal{C}$ is a set $\{\mathcal{A}_i \,;\, i \in I\}$ of observables for which all
corresponding self-adjoint operators $\{A_i \,;\, i \in I\}$ commute.

By an appeal to the FUNC rule, each measuring context $\{\mathcal{A}_i \,;\, i \in I\}$ might be represented
by the commutative W*-algebra35 of operators generated by the set $\{A_i \,;\, i \in I\}$. Indeed, this
construction exactly adds all the observables $\mathcal{A}$ to the measuring context for which there exist
an $i \in I$ and a Borel function $f$ such that $A = f(A_i)$. However, since no special
structure (like that of a W*-algebra) is required to make the argument of this paragraph
work, it seems redundant to demand it. In fact, the FUNC rule may better be avoided (in
order not to make unnecessary assumptions) and an observable of the form $f(\mathcal{A})$ may only
be interpreted as applying the function $f$ to the measurement result of $\mathcal{A}$ (that is, it is not
required that it is associated with a self-adjoint operator).

Definition 2.5. A contextual pure state $\lambda$ is a rule that to each measuring context $\mathcal{C}$ assigns
a function $\lambda_{\mathcal{C}} : \mathcal{C} \to \mathbb{R}$ such that $\lambda_{\mathcal{C}}(\mathcal{A}) \in \sigma(A)$ for all $\mathcal{A} \in \mathcal{C}$.

It turns out that the set of all contextual pure states is large enough to avoid a conflict with
quantum mechanics. In fact, contextual hidden-variable theories do exist, Bohmian mechanics
probably being the most popular one ([Boh1], see [Tum1] for a friendly introduction). The
main objection against Bohmian mechanics is that it is non-local, i.e., it allows action at a
distance. This is what inspired Bell to write his article [Bel1]: he wanted to investigate whether
this non-local behavior is a necessary property of any hidden-variable theory. It turns out that
it is, at least if one adopts the following definition of locality.36

Definition 2.6. A contextual pure state is called local if for any pair of observables $\mathcal{A}_1, \mathcal{A}_2$
corresponding to commuting operators $A_1, A_2$ that can be measured using measuring devices
that may be separated at an arbitrary distance from each other, and for every measuring
context $\mathcal{C}$ containing $\mathcal{A}_1$ and $\mathcal{A}_2$, one has

$$\lambda_{\mathcal{C}}(\mathcal{A}_1) = \lambda_{\mathcal{C} \setminus \{\mathcal{A}_2\}}(\mathcal{A}_1) \quad \text{and} \quad \lambda_{\mathcal{C}}(\mathcal{A}_2) = \lambda_{\mathcal{C} \setminus \{\mathcal{A}_1\}}(\mathcal{A}_2). \tag{76}$$

Such a pair of observables will be called separable. The set of all local contextual pure states
is called $\Lambda_{\mathrm{Bell}}$.

35 A W*-algebra (or von Neumann algebra) is a set of bounded operators that is closed under taking adjoints
and that is equal to its double commutant, where the commutant of a set of bounded operators $V$ is the set of
all bounded operators that commute with every element of $V$. There are also more abstract definitions that
do not require the notion of an operator on a Hilbert space (see for example [Dav]).

36 One may easily check that also the catalog theory of Section 2.3.2 must indeed be non-local if it is modified
to be empirically equivalent with quantum mechanics.

One may think of separable observables as observables of separated systems that are now
combined into one system. That is, the system is described by a Hilbert space of the form
$\mathcal{H}_1 \otimes \mathcal{H}_2$. Two observables may then be called separable if their corresponding operators
are of the form $A_1 \otimes 1$ and $1 \otimes A_2$. Thus $\mathcal{A}_1$ is an observable of the separate subsystem
described by the space $\mathcal{H}_1$ and $\mathcal{A}_2$ is an observable of the subsystem described by the space
$\mathcal{H}_2$. Consequently, a local pure state must assign a value to the observable $\mathcal{A}_1$ in the context
$\mathcal{C}$ independently of any observable in $\mathcal{C}$ associated with a spatially separated subsystem. In
other words, equation (76) may be iterated an arbitrary number of times until the context $\mathcal{C}$
no longer contains any observables associated with a spatially separated subsystem.
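
Definitions 2.4 to 2.6 can be made concrete with a small data structure. The sketch below is only an illustration under simplifying assumptions: observables are represented by hypothetical string labels (`"spin_1"`, `"spin_2"`), a measuring context by a `frozenset` of labels, and a contextual pure state by a dictionary that maps each context to a value assignment. The function `is_local` (a name introduced here) checks condition (76) for one separable pair.

```python
def context(*observables):
    """A measuring context: an (immutable) set of observable labels."""
    return frozenset(observables)


def is_local(state, a1, a2):
    """Check equation (76) for the separable pair (a1, a2).

    `state` maps each measuring context to a dict {observable: value}.
    For every context containing both observables, the value of a1 must
    not change when a2 is removed from the context, and vice versa.
    """
    for ctx, values in state.items():
        if a1 in ctx and a2 in ctx:
            if values[a1] != state[ctx - {a2}][a1]:
                return False
            if values[a2] != state[ctx - {a1}][a2]:
                return False
    return True


# A state whose values do not depend on what is measured far away.
LOCAL_STATE = {
    context("spin_1"): {"spin_1": +1},
    context("spin_2"): {"spin_2": -1},
    context("spin_1", "spin_2"): {"spin_1": +1, "spin_2": -1},
}

# A contextual state in which the value of spin_1 flips as soon as
# spin_2 is measured as well; it violates (76).
NONLOCAL_STATE = {
    context("spin_1"): {"spin_1": +1},
    context("spin_2"): {"spin_2": -1},
    context("spin_1", "spin_2"): {"spin_1": -1, "spin_2": -1},
}

print(is_local(LOCAL_STATE, "spin_1", "spin_2"))
print(is_local(NONLOCAL_STATE, "spin_1", "spin_2"))
```

Both example states are contextual pure states in the sense of Definition 2.5; only the first one belongs to $\Lambda_{\mathrm{Bell}}$.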
The theorem due to Bell may now be formulated as follows. 

Theorem 2.13. There is no local, contextual hidden-variable theory (and consequently no
non-contextual one) that can reproduce the statistics of the system of two spin-$\tfrac{1}{2}$ particles
prepared in the state $\psi = \tfrac{1}{\sqrt{2}}(0, 1, -1, 0)$ (that are space-like separated). Therefore, any such
theory is empirically in disagreement with quantum mechanics.

Due to the extra structure acquired (compared to $\Lambda_{\mathrm{cat}}$ or $\Lambda_{\mathrm{KS}}$) by introducing
contextuality, an examination of how the statistics of quantum mechanics are to be reproduced is
required before the theorem can be proven. This is done in the same way as in classical
statistical physics (as was also roughly sketched at the beginning of paragraph 2.3).

A macro state in the hidden-variable theory is a probability distribution $\mu$ over the set
$\Lambda_{\mathrm{Bell}}$.37 One usually thinks of the system as an ensemble of systems, or one can think of a
system whose pure state fluctuates rapidly in time.38 To talk sensibly about a probability
measure on the set $\Lambda_{\mathrm{Bell}}$, a $\sigma$-algebra $\Sigma_{\mathrm{Bell}}$ of subsets of $\Lambda_{\mathrm{Bell}}$ is required.

This $\sigma$-algebra is constructed as follows. For each observable $\mathcal{A}$ corresponding to some
operator $A$, for each measuring context $\mathcal{C}$ such that $\mathcal{A} \in \mathcal{C}$ and for each Borel set $\Delta \subset \sigma(A)$,
consider the event that the measurement of $\mathcal{A}$ yields a result in $\Delta$ in the context $\mathcal{C}$. This event
is identified with the set

$$[\mathcal{A}_{\mathcal{C}} \in \Delta] := \{\lambda \in \Lambda_{\mathrm{Bell}} \,;\, \lambda_{\mathcal{C}}(\mathcal{A}) \in \Delta\}. \tag{77}$$

For fixed $\mathcal{C}$, let $\Sigma_{\mathcal{C}}$ be the $\sigma$-algebra generated by all such sets. Then take $\Sigma_{\mathrm{Bell}}$ to be the
$\sigma$-algebra generated by all these $\sigma$-algebras. For a fixed measuring context $\mathcal{C}$, the following
equivalence relation on $\Lambda_{\mathrm{Bell}}$ is introduced:

$$\lambda \sim_{\mathcal{C}} \lambda' \iff \lambda_{\mathcal{C}}(\mathcal{A}) = \lambda'_{\mathcal{C}}(\mathcal{A}), \quad \forall \mathcal{A} \in \mathcal{C}. \tag{78}$$

37 The term 'macro state' is taken from statistical physics where it is also used to describe probability
distributions over pure states (also called micro states). The term may seem a bit awkward, since it describes
what may be called a micro state in quantum mechanics. However, it is precisely this viewpoint (i.e., that the
quantum states are micro states) that is criticized by Einstein, Podolsky and Rosen.

38 The philosophy of the interpretation of statistical mechanics is a topic of its own [Skl].

This defines a set of equivalence classes

$$\Lambda_{\mathcal{C}} := \Lambda_{\mathrm{Bell}} / \sim_{\mathcal{C}}, \qquad [\lambda]_{\mathcal{C}} := \{\lambda' \in \Lambda_{\mathrm{Bell}} \,;\, \lambda' \sim_{\mathcal{C}} \lambda\}. \tag{79}$$

For any $S \in \Sigma_{\mathcal{C}}$, it follows that if $\lambda \in S$, then $[\lambda]_{\mathcal{C}} \subset S$. Therefore, the $\sigma$-algebra $\Sigma_{\mathcal{C}}$ extends
in a natural way to a $\sigma$-algebra on $\Lambda_{\mathcal{C}}$ by taking

$$\{\{[\lambda]_{\mathcal{C}} \,;\, \lambda \in S\} \,;\, S \in \Sigma_{\mathcal{C}}\}. \tag{80}$$

For notational convenience, both $\sigma$-algebras will be denoted by the same symbol. Also, no
distinction in notation will be made for their elements. Any macro state $\mu$ now extends to a
probability measure $\mathbb{P}_{\mathcal{C}}$ on $(\Lambda_{\mathcal{C}}, \Sigma_{\mathcal{C}})$ for any measuring context $\mathcal{C}$ by taking

$$\mathbb{P}_{\mathcal{C}}(S) := \mu(S) = \int_{\Lambda_{\mathrm{Bell}}} 1_S(\lambda) \, d\mu(\lambda), \quad \forall S \in \Sigma_{\mathcal{C}}. \tag{81}$$

These restricted probability spaces $(\Lambda_{\mathcal{C}}, \Sigma_{\mathcal{C}}, \mathbb{P}_{\mathcal{C}})$ have the advantage that any observable $\mathcal{A} \in \mathcal{C}$
can be viewed as a stochastic variable on $\Lambda_{\mathcal{C}}$ by taking

$$\mathcal{A}([\lambda]_{\mathcal{C}}) := \lambda_{\mathcal{C}}(\mathcal{A}), \quad \lambda \in [\lambda]_{\mathcal{C}}, \tag{82}$$

which is well defined (i.e. independent of the choice of the representative $\lambda$).

Here one finds a relation between the usual way in which the argument is presented, with
$\mathcal{A}$ as a function of the hidden parameters on the left-hand side, and the way I present the
argument on the right-hand side. From this equation, the difference between the two
approaches can also be explained. In the more common discussions, $\mathcal{A}$ is seen as a function of
the state of the system and the measurement context separately. One would write something
like $\mathcal{A}(\lambda, m_{\mathcal{C}})$, where $\lambda$ is a hidden parameter associated with the preparation of the system
and $m_{\mathcal{C}}$ is a hidden parameter associated with the preparation of the measurement device.
Both elements are here incorporated in the use of the state $\lambda$, which may of course depend on
much more than only the system preparation and measurement preparation.



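Now consider two separable observables $\mathcal{A}_1, \mathcal{A}_2$. Because of the locality assumption (76),
it follows that

$$\begin{aligned}
\mathcal{A}_1([\lambda]_{\{\mathcal{A}_1, \mathcal{A}_2\}}) &= \lambda_{\{\mathcal{A}_1, \mathcal{A}_2\}}(\mathcal{A}_1) = \lambda_{\{\mathcal{A}_1\}}(\mathcal{A}_1) = \mathcal{A}_1([\lambda]_{\{\mathcal{A}_1\}}); \\
\mathcal{A}_2([\lambda]_{\{\mathcal{A}_1, \mathcal{A}_2\}}) &= \lambda_{\{\mathcal{A}_1, \mathcal{A}_2\}}(\mathcal{A}_2) = \lambda_{\{\mathcal{A}_2\}}(\mathcal{A}_2) = \mathcal{A}_2([\lambda]_{\{\mathcal{A}_2\}}).
\end{aligned} \tag{83}$$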

Now return to the example of the two spin-$\tfrac{1}{2}$ particles. Let $\sigma_{r_1}$ be the spin of the first
particle along the $r_1$-axis (one should actually write $\sigma_{r_1} \otimes 1$) and $\sigma_{r_2}$ the spin of the second
particle along the $r_2$-axis. Clearly, these are separable observables. Viewing them as stochastic
variables in the measuring context $\{\sigma_{r_1}, \sigma_{r_2}\}$, the expectation value of their product39 for a
fixed macro state $\mu$ is given by

$$\begin{aligned}
\mathbb{E}_{\{\sigma_{r_1}, \sigma_{r_2}\}}(\sigma_{r_1} \sigma_{r_2}) &= \int_{\Lambda_{\mathrm{Bell}}} \sigma_{r_1}([\lambda]_{\{\sigma_{r_1}, \sigma_{r_2}\}}) \, \sigma_{r_2}([\lambda]_{\{\sigma_{r_1}, \sigma_{r_2}\}}) \, d\mu(\lambda) \\
&= \int_{\Lambda_{\mathrm{Bell}}} \lambda_{\{\sigma_{r_1}\}}(\sigma_{r_1}) \, \lambda_{\{\sigma_{r_2}\}}(\sigma_{r_2}) \, d\mu(\lambda).
\end{aligned} \tag{84}$$

This equation is the key to the proof of the following lemma.

39 Sometimes this is called a correlation, but I will refrain from using that term so as to avoid confusion
with the mathematical term.

Lemma 2.14. Let $\sigma_{r_1}$ and $\sigma_{r_1'}$ denote the spin along the $r_1$ and $r_1'$ axes of the first particle
and let $\sigma_{r_2}$ and $\sigma_{r_2'}$ denote the spin along the $r_2$ and $r_2'$ axes of the second particle. Then
any macro state $\mu$ of the local, contextual hidden-variable theory satisfies the following Bell
inequality:

$$\left| \mathbb{E}_{\{\sigma_{r_1}, \sigma_{r_2}\}}(\sigma_{r_1} \sigma_{r_2}) - \mathbb{E}_{\{\sigma_{r_1}, \sigma_{r_2'}\}}(\sigma_{r_1} \sigma_{r_2'}) \right| + \left| \mathbb{E}_{\{\sigma_{r_1'}, \sigma_{r_2}\}}(\sigma_{r_1'} \sigma_{r_2}) + \mathbb{E}_{\{\sigma_{r_1'}, \sigma_{r_2'}\}}(\sigma_{r_1'} \sigma_{r_2'}) \right| \leq 2. \tag{85}$$
Proof: 

The first term in the inequality can be estimated in the following way: 

•••= / ^WJ^rJ (A {CTr2 }(cj r2 ) - A {tr } (<v)) d/x(A) 

< / A {ari} (a ri ) A {CTr2 }(a r2 ) - A {<v} (<v a ) d/x(A) (86) 



/ 

J ^Bell 



A{a r2 }(^r 2 ) - A {<v} (<T r /) d/i(A) 



Similarly, 



E K^ ,ffra } (<M ^r 2 ) + E { ^, >CTr , } (<7 r , <7 r # ) = ■ ■ ■ 

" = J K X {<r r [}( a rO ( X Wr 2 }( a r 2 ) + A K/}( CJ r 2 )) d/i(A) 

< / A {(7r , } (cj r /) A {tr }(*„) + A {<r } (<v a ) d/i(A) (87) 
= / A {(7r2} (a r2 ) + A {(7r , } (cj r2 ) d/i(A). 



Afleii 

Next, note that As e ;; can be split to four disjoint pieces 

A ++ = {A e Ase/i ; A {(Tr2 }(cr r2 ) = A {(Tr ,}(o- r2 ) = 1}; 
A + _ = {A e A.Beii ; A{ CTr2 }(o- r2 ) = -A {(Jr ,}(cr r2 ) = 1}; 
A -+ = {A e A Be H ; A {(Tr2 }(cr r2 ) = -A {(Tr ,}(a r2 ) = -1}; 
A — = {A e A Be ii ; A{ CTr2 }(o- r2 ) = A {(V }(<7 r2 ) = -1}, 



(88) 



so that h-Beii = A++ U A + _ U A_ + U A One easily checks that on each of these pieces one 

has 

\ar 2 }( a r 2 ) ~ A{^}(o"r 2 ) + \a r2 }( a r 2 ) + \ {a ^ } (oy/) =2. (89) 
This leads to the estimate 



E Wr 1 ,«r 2 }( (J r 1 0-r 2 ) - E{ CTrii(v }(0- ri a r 



+ 



E W r [ >°r 2 } (<M 0"r 2 ) + E {(7r , i(Tr , } (cr r / CT r ; 

< f 2d/x(A) = 2, (90) 
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which proves the lemma. 



□ 
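
The combinatorial core of this proof can be checked by brute force. The sketch below assumes, as in the proof, that a local pure state is fully characterized (for this experiment) by four values $\pm 1$, one for each of $\sigma_{r_1}, \sigma_{r_1'}, \sigma_{r_2}, \sigma_{r_2'}$; the function names and the representation of a macro state as a dictionary of weights are illustrative choices, not part of the original argument.

```python
import random
from itertools import product


def chsh(a, a_prime, b, b_prime):
    """Left-hand side of (85) for a single pure state with values +1/-1."""
    return abs(a * b - a * b_prime) + abs(a_prime * b + a_prime * b_prime)


def chsh_of_macro_state(weights):
    """Left-hand side of (85) for a probability distribution over pure states.

    `weights` maps tuples (a, a', b, b') to probabilities that sum to 1.
    """
    def mean(f):
        return sum(w * f(*values) for values, w in weights.items())

    first = mean(lambda a, ap, b, bp: a * b) - mean(lambda a, ap, b, bp: a * bp)
    second = mean(lambda a, ap, b, bp: ap * b) + mean(lambda a, ap, b, bp: ap * bp)
    return abs(first) + abs(second)


def random_macro_state(seed):
    """A random probability distribution over the 16 value assignments."""
    rng = random.Random(seed)
    states = list(product((-1, 1), repeat=4))
    raw = [rng.random() for _ in states]
    total = sum(raw)
    return {s: r / total for s, r in zip(states, raw)}


# Every pure state gives exactly 2, in line with equation (89).
print({chsh(*values) for values in product((-1, 1), repeat=4)})

# Mixtures can only lower the value, in line with (90).
print(max(chsh_of_macro_state(random_macro_state(seed)) for seed in range(100)))
```

The first line of output confirms that each of the sixteen assignments saturates the bound; averaging over a macro state then keeps the expression at or below 2.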



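With the use of this lemma, Theorem 2.13 can now be proven.

Proof of Theorem 2.13

All that has to be shown is that the inequality (85) can be violated in quantum mechanics.
Using the notation of Example 2.2, setting $e_1 = (1, 0)$ and $e_2 = (0, 1)$ and taking $\psi$ as in (75),
the quantum-mechanical equivalent of (84) can be calculated:

$$\begin{aligned}
\mathbb{E}_\psi(\sigma_{r_1} \sigma_{r_2}) &= \langle \psi, \sigma_{r_1} \otimes \sigma_{r_2} \psi \rangle \\
&= \tfrac{1}{2} \langle e_1 \otimes e_2 - e_2 \otimes e_1, \sigma_{r_1} \otimes \sigma_{r_2} (e_1 \otimes e_2 - e_2 \otimes e_1) \rangle \\
&= \tfrac{1}{2} \langle e_1 \otimes e_2, \sigma_{r_1} \otimes \sigma_{r_2} \, e_1 \otimes e_2 \rangle - \tfrac{1}{2} \langle e_1 \otimes e_2, \sigma_{r_1} \otimes \sigma_{r_2} \, e_2 \otimes e_1 \rangle \\
&\quad + \tfrac{1}{2} \langle e_2 \otimes e_1, \sigma_{r_1} \otimes \sigma_{r_2} \, e_2 \otimes e_1 \rangle - \tfrac{1}{2} \langle e_2 \otimes e_1, \sigma_{r_1} \otimes \sigma_{r_2} \, e_1 \otimes e_2 \rangle.
\end{aligned} \tag{91}$$

Writing out these inner products and using some trigonometry leads to the following relation:

$$\begin{aligned}
\mathbb{E}_\psi(\sigma_{r_1} \sigma_{r_2}) &= -\tfrac{1}{2} \cos(\varphi_1) \cos(\varphi_2) \\
&\quad - \tfrac{1}{2} (\cos(\vartheta_1) \sin(\varphi_1) - i \sin(\vartheta_1) \sin(\varphi_1)) (\cos(\vartheta_2) \sin(\varphi_2) + i \sin(\vartheta_2) \sin(\varphi_2)) \\
&\quad - \tfrac{1}{2} \cos(\varphi_1) \cos(\varphi_2) \\
&\quad - \tfrac{1}{2} (\cos(\vartheta_1) \sin(\varphi_1) + i \sin(\vartheta_1) \sin(\varphi_1)) (\cos(\vartheta_2) \sin(\varphi_2) - i \sin(\vartheta_2) \sin(\varphi_2)) \\
&= -\cos(\varphi_1) \cos(\varphi_2) - (\cos(\vartheta_1) \cos(\vartheta_2) + \sin(\vartheta_1) \sin(\vartheta_2)) \sin(\varphi_1) \sin(\varphi_2) \\
&= -\cos(\varphi_1) \cos(\varphi_2) - \cos(\vartheta_1 - \vartheta_2) \sin(\varphi_1) \sin(\varphi_2).
\end{aligned} \tag{92}$$

To simplify, take $\varphi_1 = \varphi_2 = \pi/2$, so that $\mathbb{E}_\psi(\sigma_{r_1} \sigma_{r_2}) = -\cos(\vartheta_1 - \vartheta_2)$. With this expression,
(85) becomes

$$|\cos(\vartheta_1 - \vartheta_2) - \cos(\vartheta_1 - \vartheta_2')| + |\cos(\vartheta_1' - \vartheta_2) + \cos(\vartheta_1' - \vartheta_2')| \leq 2. \tag{93}$$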
It is not hard to see that this inequality can be violated. For example, if one takes 

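$$\vartheta_1 = 0, \quad \vartheta_1' = \tfrac{1}{2}\pi, \quad \vartheta_2 = \tfrac{1}{4}\pi, \quad \vartheta_2' = \tfrac{3}{4}\pi, \tag{94}$$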
this inequality would result in 

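$$2\sqrt{2} = 4 \cdot \tfrac{1}{2}\sqrt{2} = |\cos(\vartheta_1 - \vartheta_2) - \cos(\vartheta_1 - \vartheta_2')| + |\cos(\vartheta_1' - \vartheta_2) + \cos(\vartheta_1' - \vartheta_2')| \leq 2, \tag{95}$$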

which is a contradiction. This completes the proof. □ 
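
The quantum-mechanical side of this proof can also be verified numerically. The sketch below assumes that the spin operator along the axis with azimuth $\vartheta$ and polar angle $\varphi$ is the Pauli-type matrix with diagonal entries $\pm\cos\varphi$ and off-diagonal entries $\sin\varphi \, e^{\mp i\vartheta}$, which is the form that matches the inner products in (92); the function names are illustrative. The angles used at the end are the commonly used choice $0, \pi/2, \pi/4, 3\pi/4$, which is an assumption of this sketch.

```python
import math

# The state (75) in the basis e1(x)e1, e1(x)e2, e2(x)e1, e2(x)e2.
PSI = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]


def sigma(theta, phi):
    """Spin matrix along the axis with azimuth theta and polar angle phi."""
    off_diagonal = math.sin(phi) * complex(math.cos(theta), -math.sin(theta))
    return [[math.cos(phi), off_diagonal],
            [off_diagonal.conjugate(), -math.cos(phi)]]


def kron(a, b):
    """Tensor (Kronecker) product of two 2x2 matrices as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def expectation(theta_1, phi_1, theta_2, phi_2):
    """<psi, sigma_r1 (x) sigma_r2 psi> for the state PSI."""
    m = kron(sigma(theta_1, phi_1), sigma(theta_2, phi_2))
    m_psi = [sum(m[i][j] * PSI[j] for j in range(4)) for i in range(4)]
    return sum(PSI[i] * m_psi[i] for i in range(4)).real


def chsh_quantum(t1, t1_prime, t2, t2_prime):
    """Left-hand side of (85) with all polar angles equal to pi/2."""
    def e(a, b):
        return expectation(a, math.pi / 2, b, math.pi / 2)

    return abs(e(t1, t2) - e(t1, t2_prime)) + abs(e(t1_prime, t2) + e(t1_prime, t2_prime))


value = chsh_quantum(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(value, 2 * math.sqrt(2))
```

The two printed numbers agree up to rounding, so the bound 2 of Lemma 2.14 is exceeded for this choice of axes.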



Stochastic Hidden Variables 

The starting point for a stochastic hidden-variable theory is again a set $\Lambda$ of pure states that
may be contextual in nature. However, instead of assigning a value to each observable, a
pure state $\lambda$ assigns to each observable $\mathcal{A}$ a probability measure on the space $(\sigma(A), \Sigma_A)$ for
each measuring context $\mathcal{C}$ with $\mathcal{A} \in \mathcal{C}$, where $\Sigma_A$ is the $\sigma$-algebra of Borel sets in $\sigma(A)$. This
measure will be denoted by

$$\Delta \mapsto \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda]. \tag{96}$$

It is further assumed that there is a way to conditionalize. That is, if the measurement
of the observable $\mathcal{A}_1$ yields some subset $\Delta_1 \in \Sigma_{A_1}$ as a measurement result, the pure state of
the system may be altered going from $\lambda$ to $\lambda'$. Instead of introducing some notation for the
altered state, I introduce the following notation to denote that the state may be altered by a
previous measurement:

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in \Delta_1, \lambda] := \mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \lambda'], \tag{97}$$

for any $\mathcal{A}_2 \in \mathcal{C}$, $\Delta_2 \in \Sigma_{A_2}$.

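The reader may notice an ambiguity in this notation. If the state $\lambda$ changes into the state
$\lambda'$, all probability distributions may be altered. That is, (97) is supposed to hold for arbitrary
observables $\mathcal{A}_2$ in arbitrary contexts $\mathcal{C}$ that do not necessarily contain $\mathcal{A}_1$. It then seems
natural to assume that the new state may depend on the context in which the result $\mathcal{A}_1 \in \Delta_1$
was obtained. To incorporate this possibility, the notation should in fact look something like

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in_{\mathcal{C}_1} \Delta_1, \lambda], \tag{98}$$

where $\mathcal{A}_1 \in_{\mathcal{C}_1} \Delta_1$ denotes that the result $\mathcal{A}_1 \in \Delta_1$ was obtained in the measuring context $\mathcal{C}_1$.
However, in what follows it will be the case that, once a context is chosen, it remains fixed for
all subsequent measurements and one may think of $\mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in \Delta_1, \lambda]$ as shorthand for
$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in_{\mathcal{C}} \Delta_1, \lambda]$. It may also be noted that this way of conditionalizing is different
from the one in standard probability theory. For example, in general it is not to be expected
that, whenever $\mathcal{A}_1, \mathcal{A}_2 \in \mathcal{C}$, the equality

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in \Delta_1, \lambda] \, \mathbb{P}_{\mathcal{C}}[\mathcal{A}_1 \in \Delta_1 \mid \lambda] = \mathbb{P}_{\mathcal{C}}[\mathcal{A}_1 \in \Delta_1 \mid \mathcal{A}_2 \in \Delta_2, \lambda] \, \mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \lambda] \tag{99}$$

will hold for all $\lambda$.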

A macro state of the system will again correspond to a probability distribution over the
pure states. At first glance this may look like a distribution over distributions. However, the
pure states are not probability distributions themselves but rather collections of distributions.
Also, one is encouraged to think of the pure states as complete descriptions of the system
(in the sense that they contain maximal information about the system), and to interpret the
macro states as descriptions of the system based on incomplete information (hence as not
exactly knowing what the pure state is). Strictly speaking, this definition only makes sense
if the space $\Lambda$ is endowed with a $\sigma$-algebra $\Sigma$ of subsets. This is introduced in the following
way. For a fixed context $\mathcal{C}$, observable $\mathcal{A} \in \mathcal{C}$ and Borel set $\Delta \in \Sigma_A$ there is a map $\Lambda \to [0, 1]$
given by $\lambda \mapsto \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda]$. $\Sigma$ is taken to be the smallest $\sigma$-algebra that makes all these maps
measurable.

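A macro state $\mu$ can be used to assign probabilities to actual events by means of the
stochastic variables $\lambda \mapsto \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda]$ by taking

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \mu] := \int_{\Lambda} \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda] \, d\mu(\lambda). \tag{100}$$

This is to be read as the probability to find a value in $\Delta$ if one measures the observable $\mathcal{A}$ in
the context $\mathcal{C}$, given that the state is $\mu$. It is good to note that this is not the probability of
an event, that is, it is not the measure of some subset of $\Lambda$, but rather the expectation value
of the variable $\lambda \mapsto \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda]$. This does indeed seem the only natural way to make sense
of probabilities of experimental events in this abstract mathematical context in such a way
that

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \delta_\lambda] := \int_{\Lambda} \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda'] \, d\delta_\lambda(\lambda') = \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in \Delta \mid \lambda], \tag{101}$$

where $\delta_\lambda$ is the macro state given by

$$\delta_\lambda(S) = \begin{cases} 1, & \lambda \in S, \\ 0, & \lambda \notin S, \end{cases} \quad \forall S \in \Sigma. \tag{102}$$

In the same way, conditionalized probabilities may be introduced by taking

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in \Delta_1, \mu] := \int_{\Lambda} \mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in \Delta_1, \lambda] \, d\mu(\lambda), \tag{103}$$

and the expectation value of the observable $\mathcal{A}$ in the context $\mathcal{C}$ given the state $\mu$ is defined to
be

$$\mathbb{E}_{\mathcal{C}}[\mathcal{A} \mid \mu] := \int_{\Lambda} \mathbb{E}_{\mathcal{C}}[\mathcal{A} \mid \lambda] \, d\mu(\lambda) = \int_{\Lambda} \int_{\sigma(A)} x \, \mathbb{P}_{\mathcal{C}}[\mathcal{A} \in dx \mid \lambda] \, d\mu(\lambda). \tag{104}$$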

As was done for deterministic theories, a condition of locality is now introduced. For the 
deterministic models it was implicitly assumed that measurements do not disturb the state of 
the system, whereas here disturbances are allowed. However, in order to proceed one needs 
to assume that such disturbances may only have a local effect. Consequently, to obtain a 
Bell-inequality, two locality assumptions are made for the stochastic model. 

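Definition 2.7.

(i) A state $\mu$ is called outcome independent (OILOC) if for any pair of separable observables
$\mathcal{A}_1, \mathcal{A}_2$, corresponding to commuting operators $A_1, A_2$, for every measuring context $\mathcal{C}$
containing $\mathcal{A}_1$ and $\mathcal{A}_2$, and for every pair of measurable sets $\Delta_1 \subset \sigma(A_1)$ and $\Delta_2 \subset \sigma(A_2)$,
one has

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_1 \in \Delta_1 \mid \mathcal{A}_2 \in \Delta_2, \mu] = \mathbb{P}_{\mathcal{C}}[\mathcal{A}_1 \in \Delta_1 \mid \mu]. \tag{105}$$

(ii) A state $\mu$ is called context independent (CILOC) if for any pair of separable observables
$\mathcal{A}_1, \mathcal{A}_2$, corresponding to commuting operators $A_1, A_2$, for every measuring context $\mathcal{C}$
containing $\mathcal{A}_1$ and $\mathcal{A}_2$, and for every measurable set $\Delta \subset \sigma(A_1)$, one has

$$\mathbb{P}_{\mathcal{C}}[\mathcal{A}_1 \in \Delta \mid \mu] = \mathbb{P}_{\mathcal{C} \setminus \{\mathcal{A}_2\}}[\mathcal{A}_1 \in \Delta \mid \mu]. \tag{106}$$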

These two locality criteria are due to Jarrett [Jar] (similar criteria are also discussed in
[vF2]). The first condition states that the measurement result is not allowed to depend on
the result of a (simultaneous) measurement made far away. That is, measurement results may
not influence future measurement results instantaneously. The second condition expresses the
idea that the measurement result of an observable may not depend on the experimental setup
used far away. Note that for OILOC states and separable observables, relation (99) holds. For
such observables one can thus sensibly talk about a joint probability distribution by taking

$$\mathbb{P}_{\mathcal{C}}[(\mathcal{A}_1 \in \Delta_1) \wedge (\mathcal{A}_2 \in \Delta_2) \mid \lambda] := \mathbb{P}_{\mathcal{C}}[\mathcal{A}_2 \in \Delta_2 \mid \mathcal{A}_1 \in \Delta_1, \lambda] \, \mathbb{P}_{\mathcal{C}}[\mathcal{A}_1 \in \Delta_1 \mid \lambda]. \tag{107}$$

To finally obtain a Bell-type inequality, a third implicit assumption is made, namely, that
the state $\mu$ of the system does not depend on the measuring context $\mathcal{C}$. The idea is that the
state may have been prepared long before the choice of the measuring context was made. This
is not so much an assumption of locality, but an assumption of free choice: the experimenter
is free to choose the experimental set-up, independent of the history of the system. A truly
deterministic theory may deny this assumption (although advocates of such a theory would
probably not hope to find a solution in stochastic hidden variables anyway). This leads to the
following formulation of the Bell inequality, again in terms of the experiment of Example 2.2.



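Lemma 2.15. Let $\sigma_{r_1}$ and $\sigma_{r_1'}$ denote the spin of the first particle along the $r_1$ and $r_1'$ axes
and let $\sigma_{r_2}$ and $\sigma_{r_2'}$ denote the spin of the second particle along the $r_2$ and $r_2'$ axes. Consider
the measuring contexts

$$\mathcal{C}_{12} = \{\sigma_{r_1}, \sigma_{r_2}\}, \quad \mathcal{C}_{12'} = \{\sigma_{r_1}, \sigma_{r_2'}\}, \quad \mathcal{C}_{1'2} = \{\sigma_{r_1'}, \sigma_{r_2}\}, \quad \mathcal{C}_{1'2'} = \{\sigma_{r_1'}, \sigma_{r_2'}\}. \tag{108}$$

Then any macro state $\mu$ of the contextual stochastic hidden-variable theory that satisfies the
locality claims (i) and (ii) of Definition 2.7 satisfies the following Bell inequality:

$$\left| \mathbb{E}_{\mathcal{C}_{12}}[\sigma_{r_1} \sigma_{r_2} \mid \mu] - \mathbb{E}_{\mathcal{C}_{12'}}[\sigma_{r_1} \sigma_{r_2'} \mid \mu] \right| + \left| \mathbb{E}_{\mathcal{C}_{1'2}}[\sigma_{r_1'} \sigma_{r_2} \mid \mu] + \mathbb{E}_{\mathcal{C}_{1'2'}}[\sigma_{r_1'} \sigma_{r_2'} \mid \mu] \right| \leq 2. \tag{109}$$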



Proof: 

First, an expression for the expectation values is obtained using both locality assumptions: 

12 l^ri 0~T2 IH = / Ec |A]dA*(A) 
J A 

= / P Cl2 K = lA<j r2 = l|A]+P Cl2 [a ri = -lAa r2 = -l|A]d/;(A) 
Ja 



/ Pc 12 kr 

Ja 



1 A o> 



-i|A] + p Cl2 k, 



»1 



-1 A a, 



l'2 



l|A]d M (A) 



: Pci 2 Ki = 1 A a r , 2 = l\n] + Pci 2 [o" n = -1 A oy 2 = -l\a] 

- PCiakn = 1 A cr r2 = -l|/i] - Pc X2 [cr n = -1 A a T2 = l\fj] 
: Pci 2 Ki = l|cr r2 = l,/i]P Cl2 [o- r2 = +P Cl2 [cJ ri =-l|a r2 = -l,^]P Cl2 K: 



Pci 2 [0Yi = l\a r 



-1>M] PC12K 



-!N -PC12K1 = -lkr 2 = 1,M]PCi 2 [°V 



P Cl2 [a n = l\u]F Cl2 [a r2 = l\u]+F Cl2 [a ri = -l\fj]F Cl2 [a r2 = 
- PciaKi = l|/z]Pc M [0Va = ~MfA ~ P C 12 kn = Pd 2 [^r 2 = MlA 

'" 6 p K}k = p { CT r 2 }[°"r 2 = i|m] + P{ CTri }Ki = -i|HPk 2 }K 2 = -iH 

-P{ CTri }Ki = l|//]P{ CTr2 }[(T r2 =~l\fj] -P {CTri }[cr ri =-l\fj] P{a r2 }kr 2 = MfA 

P {CTn} [a n = 1| M ] - P Kl} K = -1| M ]) (P {CTr2 }kr 2 = l|/i] - P{, r2 }kr 2 = -i|m] 



(110) 
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Similar relations hold for the other expectation values. Next, introduce the functions 

fi(ji) := F {ar . } [a ri =l\n]- V {ari) [a n = 

■.= v Wr , } Wri = i|m] - v Wr , y [* ri = -i| M ], 



for i = 1,2. In terms of these functions, the inequality (109) reads 



\hWM - hMfLMl + |/((m)/ 2 (m) + /i(m)/2(m)| < 2. (ii2) 

That this inequality is satisfied follows almost immediately from < 1 and < 1 
for i = 1,2. Indeed, one has 

|/i(aO/ 2 (m) - /i(M)^(M)| + |/i(m)/ 2 (m) + /i(m)/ 2 G«)| 

< |/ 2 (M) - /2(M)| + |/2(m) + < 2. (113) 

This completes the proof. □ 
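
The last step of this proof only uses that the four numbers $f_1, f_1', f_2, f_2'$ lie in $[-1, 1]$. A minimal sketch of a numerical sanity check follows; it evaluates the left-hand side of (112) on a regular grid of such values. The grid resolution and the function names are arbitrary choices made for this illustration, and a grid search is of course no substitute for the estimate (113).

```python
def chsh_factorized(f1, f1_prime, f2, f2_prime):
    """Left-hand side of (112) for factorized expectation values."""
    return abs(f1 * f2 - f1 * f2_prime) + abs(f1_prime * f2 + f1_prime * f2_prime)


def grid(steps):
    """Evenly spaced points in [-1, 1], endpoints included."""
    return [-1 + 2 * k / steps for k in range(steps + 1)]


def max_on_grid(steps=10):
    """Largest value of the left-hand side of (112) over the grid."""
    points = grid(steps)
    return max(chsh_factorized(a, b, c, d)
               for a in points for b in points for c in points for d in points)


print(max_on_grid())
```

The maximum found on the grid is 2 (up to rounding), attained for instance at $f_1 = f_1' = f_2 = f_2' = 1$, so the bound in (112) cannot be improved within this class of models.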



Corollary 2.16. There is no local, contextual stochastic hidden-variable theory that can
reproduce the statistics of the system of two spin-$\tfrac{1}{2}$ particles prepared in the state
$\psi = \tfrac{1}{\sqrt{2}}(0, 1, -1, 0)$ (that are space-like separated). Therefore, any such theory is empirically
in disagreement with quantum mechanics.

Remark 2.2. It has been shown (see for instance [Cir], [Lan1]) that the maximal violation
of the Bell inequality is by a factor $\sqrt{2}$ (as in equation (95)). That is, for any 4-tuple
of operators $A_1, A_2, A_3, A_4$, each of the form $A_i = 2P_i - 1$ with $P_i$ a projection, such that
$[P_1, P_3] = [P_2, P_4] = 0$ and $[P_1, P_2] \neq 0$ and $[P_3, P_4] \neq 0$, it holds for every state $\psi$ that

$$|\mathbb{E}_\psi(A_1 A_3) - \mathbb{E}_\psi(A_1 A_4)| + |\mathbb{E}_\psi(A_2 A_3) + \mathbb{E}_\psi(A_2 A_4)| \leq 2\sqrt{2}. \tag{114}$$

Thus quantum mechanics itself satisfies an inequality similar to the Bell inequality.
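
For the special case of the singlet state with measurement axes in one plane, the bound $2\sqrt{2}$ can be illustrated with a coarse search over the four angles. The sketch below takes the relation $\mathbb{E}_\psi(\sigma_{r_1}\sigma_{r_2}) = -\cos(\vartheta_1 - \vartheta_2)$ from the proof of Theorem 2.13 as given; the grid step of $\pi/8$ and the function names are assumptions of this example. It does not cover the general operators of Remark 2.2.

```python
import math


def singlet_expectation(theta_1, theta_2):
    """Expectation of the spin product for the state (75), axes in one plane."""
    return -math.cos(theta_1 - theta_2)


def chsh_singlet(t1, t1_prime, t2, t2_prime):
    """Left-hand side of the Bell inequality for the given four angles."""
    return (abs(singlet_expectation(t1, t2) - singlet_expectation(t1, t2_prime))
            + abs(singlet_expectation(t1_prime, t2) + singlet_expectation(t1_prime, t2_prime)))


def max_chsh_on_grid(steps=16):
    """Maximum over a grid of angles; t1 is fixed to 0 by rotational symmetry."""
    angles = [2 * math.pi * k / steps for k in range(steps)]
    return max(chsh_singlet(0.0, a, b, c)
               for a in angles for b in angles for c in angles)


print(max_chsh_on_grid(), 2 * math.sqrt(2))
```

On this grid the largest value found coincides with $2\sqrt{2}$ up to rounding, and no choice of angles exceeds it.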

Discussion 



Theorem 2.13 and Corollary 2.16 are remarkable results. Whereas the theorems of von Neumann
and Kochen & Specker prove the incompatibility of hidden variables with quantum
mechanics at an abstract level, the Bell inequality provides a distinction between local
realist theories and quantum mechanics that is susceptible to experimental investigation. No
wonder Shimony coined the term "experimental metaphysics" to describe related experimental
research. The first experimental tests were performed by Freedman and Clauser40 [FC], but
the ones performed by Aspect [ADR] are often seen as the decisive ones (in favor of quantum
mechanics).

40 In fact, this experiment tested a modification of the Bell inequality that is due to Clauser, Horne, Shimony
and Holt (known as the CHSH inequality) [CHSH].

Although it is the general consensus that the violation of the Bell inequality excludes the
possibility of local realism, there are possible objections against this claim. For example,
besides expressing a certain notion of locality, equation (76) may also be viewed as a consequence
of free will. Indeed, it expresses the idea that the measurement result of $\mathcal{A}_1$ does not depend
on whether or not one chooses to use a measuring context in which $\mathcal{A}_2$ can be measured. This
choice may therefore be considered to be free. Conversely, if one denies this form of free will,
a violation of (76) is possible without allowing action at a distance. This is because from this
point of view, the actual measuring context was already determined before the experiment
was even thought of.41 There is therefore no need to demand certain relations between
different measuring contexts, since only one measuring context is actual. In particular, for the
equation $\lambda_{\mathcal{C}}(\mathcal{A}_1) = \lambda_{\mathcal{C} \setminus \{\mathcal{A}_2\}}(\mathcal{A}_1)$ there is no reason to assume that both sides of (76) should
be defined from the determinist point of view, i.e., $\lambda_{\mathcal{C}}$ need only be defined for the actual
measuring context that is determined to be chosen. Some recent developments following
this philosophy can be found in [tH1] and [BJS]. However, most scientists find this viewpoint
'conspiratorial' and don't wish to abandon free will for the sake of local realism. I will come
back to this discussion in Chapter 4.

Certain other explicit and implicit assumptions made in the derivation of the Bell
inequality have also been subjected to criticism;42 see for example [HB], [But], [Gis] and [Shi2].
The experimental tests that show the violation of Bell inequalities in Nature have also been
criticized (cf. [FR], [Gil], [San]). Furthermore, for an investigation and criticism of the Bell
inequalities from a probability-theoretic point of view, one may refer to [Khr]. The entire
criticism may perhaps best be summarized in the following way.



"Between the metaphysical statement "a local realistic theory is impossible" and 
the actual experimental set-up there is a huge gap, which can only be bridged 
with the aid of many auxiliary hypotheses. Any one of these could be wrong. 
Proceeding from the experimental side, we can, for example, point out that there 
are "experimental loopholes" [. . .] in Aspect's experiments, which, if investigated 
further, might turn out to be responsible for the result. We can also suspect the 
existence of some "selection effect" which influences the detection probabilities, 
so that Aspect's experiments do not actually test Bell's inequality [...]. We can 
accept the possibility that some additional implicit assumption has entered into 
the mathematical derivation of Bell's inequality, or doubt that the mathemati- 
cal criteria used in this derivation are accurate and complete translations of the 
metaphysical concepts "realism" and "locality". And surely, there are many more 
possibilities, which we cannot see from within the network of present-day physical 
concepts." [BP] 

Indeed, one must always be careful about what conclusions are drawn from mathematical 
theorems. As will be shown in the next chapter, there may be creative ways to escape plausible 
assumptions, thereby rendering them implausible. And in Chapter 4 it will also be argued
that no mathematical theorem at all can ever decide on the true nature of reality. 

However, as long as we are stuck with these present-day physical concepts, it seems that the
only way to maintain the possibility of a hidden-variable theory is to accept either non-locality
or absolute determinism (thereby denying free will). Both options are unacceptable
to me (as to most people), and the only remaining option seems to be to accept the strangeness
of quantum mechanics, and to try to make sense of the Copenhagen interpretation or to find a
better interpretation. In the end, it will turn out that a choice has to be made on what is
to be expected from a physical theory and what is to be demanded from it. I will argue that
if one wishes to make as few metaphysical assumptions as possible, a view must be adopted
that explicitly incorporates the idea that not everything can be known.

41 See also the implicit assumption made just above the formulation of Lemma 2.15.

42 Some of them are not directly translatable to the notation used in this paper. It might be interesting to
investigate whether or not they still apply here or if the notation used here leads to other possible objections.



3 The Alleged Nullification of the Kochen-Specker Theorem

What is proved by impossibility proofs is lack of imagination.

- J. S. Bell

3.1 Introduction 

It is a common phenomenon in any discussion on the foundations of some subject that the 
stronger a statement, the more creative the theories that oppose this statement. This is no 
different in the hidden-variable discussion. In 1999 Meyer [Mey] unleashed a discussion on the alleged 'nullification' of the Kochen-Specker Theorem, which seems to have resulted in a stalemate between Appleby on one side [App4] and Barrett and Kent on the other [BK]. The aim of this chapter is to give an overview of this discussion and assess whether or not the Kochen-Specker Theorem has been 'nullified'. Although [BK] already gives an overview of the discussion, this overview is obviously prejudiced, and hence it seems worthwhile to present the story from the perspective of an outsider.



3.2 The Nullification 

The loophole in the Kochen-Specker Theorem that Meyer found lies neither in the mathematical proof of the theorem, nor in the somewhat abstract FUNC rule, but in its use of the observable postulate. All proofs of the Kochen-Specker Theorem are based on finding a set of self-adjoint operators that cannot all be assigned a definite value in such a way that the FUNC rule is satisfied. A contradiction with quantum mechanics then arises if one assumes that these operators correspond to actual observables. This dubious assumption is the reversal of the observable postulate (Section 2.1). In fact, it is not clear at all why every self-adjoint operator (or any specific self-adjoint operator) should correspond to an observable. On the other hand, the operators that appear in the proof of Lemma 2.12 are generally accepted to correspond to observables (see also the discussion at the end of Section 2.3.3).

But is it really necessary to consider these operators to be observables? More precisely, are all operators corresponding to squared spins along some axis empirically distinguishable? Meyer thinks they are not. According to him, "no experimental arrangement could be aligned to measure spin projections along coordinate axes specified within more than finite precision" [Mey]. I tend to agree with this. For small enough $\epsilon$, the squared spin along some axis $r$ may well be indistinguishable from the squared spin along the axis $r + \epsilon$. Meyer then argues that all that has to be done to 'nullify' the Kochen-Specker Theorem is to find a set of observables that can be assigned values in accordance with the FUNC rule, such that the squared spin along any axis is indistinguishable from some observable in this set.

The choice of the squared spin along some axis $r$ is commonly identified with the point on the unit sphere $S^2$ in $\mathbb{R}^3$ where the axis intersects this sphere. 43 The most natural subset of $S^2$ that results in a set of observables that is empirically indistinguishable from the set of observables obtained from the entire $S^2$ is $S^2 \cap \mathbb{Q}^3$. Note that this is not the same as the set of all points in $S^2$ whose corresponding axes go through a point in $\mathbb{Q}^3$. For example, the axis through $(1,1,0)$ intersects the sphere in the points $\pm\frac{1}{\sqrt{2}}(1,1,0) \notin S^2 \cap \mathbb{Q}^3$. It is, however, true that $S^2 \cap \mathbb{Q}^3$ is indeed dense in $S^2$. In fact, this is also true in higher dimensions (i.e. $S^n \cap \mathbb{Q}^{n+1}$ is dense in $S^n$; see for example [Sch]).

43 In fact, for each axis there are two points on the sphere. Such points are considered to be equivalent.
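The density of $S^2 \cap \mathbb{Q}^3$ can be made concrete with inverse stereographic projection, which sends rational points of the plane to rational points of the sphere. The following is a minimal sketch under that assumption; it is a standard construction and not necessarily the argument used in [Sch], and the function names are hypothetical.

```python
from fractions import Fraction
import math

def rational_sphere_point(u, v):
    """Inverse stereographic projection: rational (u, v) -> rational point on S^2."""
    d = u * u + v * v + 1
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1) / d)

def approximate_direction(r, max_denominator):
    """Return a point of S^2 with rational coordinates close to the unit vector r.

    Assumes r is not the north pole (0, 0, 1), the centre of the projection.
    """
    # Stereographic coordinates of r, rounded to nearby rationals.
    u = Fraction(r[0] / (1 - r[2])).limit_denominator(max_denominator)
    v = Fraction(r[1] / (1 - r[2])).limit_denominator(max_denominator)
    return rational_sphere_point(u, v)

# The axis through (1, 1, 0) misses the rational sphere, but only barely.
r = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)
q = approximate_direction(r, 1000)
assert sum(c * c for c in q) == 1                  # exact arithmetic: q lies on S^2
distance = math.dist(r, [float(c) for c in q])     # small; raise max_denominator to reduce it
```

Because the coordinates are `Fraction` objects, membership of the sphere is checked exactly rather than up to rounding error.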

To complete the argument of Meyer, the following proposition must be proven.

Proposition 3.1. The set $\mathrm{Obs}_{\mathrm{fp}}$ of all squared spin observables along axes that intersect the set $S^2 \cap \mathbb{Q}^3$ can be assigned definite values in accordance with the FUNC rule. More explicitly, there exists a map $f : S^2 \cap \mathbb{Q}^3 \to \{0,1\}$ such that for all $x, x', x'' \in S^2 \cap \mathbb{Q}^3$ one has

$$f(x) = f(-x); \tag{115a}$$
$$\text{if } x \perp x', \text{ then } f(x) + f(x') \geq 1; \tag{115b}$$
$$\text{if } x \perp x' \perp x'' \perp x, \text{ then } f(x) + f(x') + f(x'') = 2. \tag{115c}$$

The proof I will present here is based on the one given by Havlicek, Krenn, Summhammer and Svozil in [HKSS]. It relies on the following lemma, which is also proven in [HKSS].

Lemma 3.2. An axis intersects $S^2 \cap \mathbb{Q}^3$ if and only if it intersects the set of all triples of integers that satisfy the Pythagorean property:

$$\mathbb{Z}^3_{\mathrm{Pyth}} := \{(x,y,z) \in \mathbb{Z}^3 \setminus \{0\} \; ; \; x^2 + y^2 + z^2 = n^2, \; n \in \mathbb{N}\}. \tag{116}$$

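Proof:

For the 'only if', suppose $\left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) \in S^2 \cap \mathbb{Q}^3$. Let $n \in \mathbb{N}$ be the least common multiple (lcm) of $m_1, m_2, m_3$. Then $n \cdot \left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) \in \mathbb{Z}^3$ lies on the same axis as $\left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right)$ and

$$\left(n\frac{n_1}{m_1}\right)^2 + \left(n\frac{n_2}{m_2}\right)^2 + \left(n\frac{n_3}{m_3}\right)^2 = n^2\left(\left(\frac{n_1}{m_1}\right)^2 + \left(\frac{n_2}{m_2}\right)^2 + \left(\frac{n_3}{m_3}\right)^2\right) = n^2. \tag{117}$$

For the 'if', suppose $(x,y,z) \in \mathbb{Z}^3_{\mathrm{Pyth}}$ with $x^2 + y^2 + z^2 = n^2$. Then $\frac{1}{n}(x,y,z) \in S^2 \cap \mathbb{Q}^3$, and it lies on the same axis as $(x,y,z)$. □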
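Both directions of this proof are simple rescalings, which the following sketch spells out with exact rational arithmetic. The function names are hypothetical, and `math.lcm` with several arguments assumes Python 3.9 or later.

```python
from fractions import Fraction
from math import isqrt, lcm

def to_pythagorean(point):
    """'Only if' direction: scale a rational unit vector by the lcm of its denominators.

    Returns the integer triple together with the scaling factor n of (117).
    """
    n = lcm(*(c.denominator for c in point))
    return tuple(int(n * c) for c in point), n

def to_sphere(triple):
    """'If' direction: divide an integer triple with x^2 + y^2 + z^2 = n^2 by n."""
    squared_norm = sum(c * c for c in triple)
    n = isqrt(squared_norm)
    if n == 0 or n * n != squared_norm:
        raise ValueError("triple does not satisfy the Pythagorean property")
    return tuple(Fraction(c, n) for c in triple)

point = (Fraction(1, 3), Fraction(2, 3), Fraction(2, 3))
triple, n = to_pythagorean(point)
assert sum(c * c for c in triple) == n * n     # equation (117)
assert to_sphere(triple) == point              # same axis, back on the sphere
```

Note that the triple produced by `to_pythagorean` need not be primitive; dividing out the greatest common divisor is a separate step, used in the proof of Proposition 3.1.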



Proof of Proposition 3.1:

For any point $(x,y,z) \in \mathbb{Z}^3_{\mathrm{Pyth}}$, let

$$(x',y',z') := \left(\frac{x}{\gcd(x,y,z)}, \frac{y}{\gcd(x,y,z)}, \frac{z}{\gcd(x,y,z)}\right),$$

where gcd stands for greatest common divisor. This is again a point in $\mathbb{Z}^3_{\mathrm{Pyth}}$. It follows that precisely one of the numbers $x'$, $y'$ or $z'$ must be odd. This can be seen as follows. Let $n^2 = x'^2 + y'^2 + z'^2$. If $n$ is even, then $n^2 \equiv 0 \pmod 4$. Because not all of $x'$, $y'$ and $z'$ can be even (their greatest common divisor equals 1), precisely two must be odd. Suppose $y'$ and $z'$ are odd. Since $y'^2 - 1 = (y' - 1)(y' + 1)$ is the product of two even numbers, $y'^2 \equiv 1 \pmod 4$, and similarly $z'^2 \equiv 1 \pmod 4$. But this implies that

$$2 = n^2 - x'^2 - (y'^2 - 1) - (z'^2 - 1) \equiv 0 \pmod 4, \tag{118}$$

which is a contradiction. Therefore, $n$ cannot be even. If $n$ is odd, then either $x', y', z'$ are all odd, or precisely one of them is odd. If they are all odd, then $x'^2 + y'^2 + z'^2 \equiv 3 \pmod 4$, which leads to a contradiction, since $n^2 \equiv 1 \pmod 4$. Therefore, precisely one of $x', y', z'$ must be odd.

This leads to the definition of the function

$$f : S^2 \cap \mathbb{Q}^3 \to \{0,1\}, \qquad f\left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) := \begin{cases} 0, & \text{if } \dfrac{\operatorname{lcm}(m_1,m_2,m_3)}{\gcd(n_1,n_2,n_3)} \dfrac{n_3}{m_3} \text{ is odd;} \\[2ex] 1, & \text{if } \dfrac{\operatorname{lcm}(m_1,m_2,m_3)}{\gcd(n_1,n_2,n_3)} \dfrac{n_3}{m_3} \text{ is even.} \end{cases} \tag{119}$$



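To show that this function satisfies (115), note that the map

$$S : \left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) \mapsto \frac{\operatorname{lcm}(m_1,m_2,m_3)}{\gcd(n_1,n_2,n_3)} \left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) \tag{120}$$

takes elements of $S^2 \cap \mathbb{Q}^3$ to elements $(x,y,z) \in \mathbb{Z}^3_{\mathrm{Pyth}}$ with $\gcd(x,y,z) = 1$. Condition (115a) immediately follows from the definition of $f$. Condition (115b) will follow from (115c), since if $x, x' \in S^2 \cap \mathbb{Q}^3$ with $x \perp x'$, then $x \times x' \in S^2 \cap \mathbb{Q}^3$, where $\times$ denotes the exterior product, so that $f(x) + f(x') = 2 - f(x \times x') \geq 1$. So the only thing left to show is (115c).

Suppose $x, x', x''$ is a triple of mutually orthogonal vectors in $S^2 \cap \mathbb{Q}^3$. If $f(x) = 0$, it must be shown that $f(x') = f(x'') = 1$. For this, set

$$x = \left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right), \quad x' = \left(\frac{n'_1}{m'_1}, \frac{n'_2}{m'_2}, \frac{n'_3}{m'_3}\right), \quad x'' = \left(\frac{n''_1}{m''_1}, \frac{n''_2}{m''_2}, \frac{n''_3}{m''_3}\right) \tag{121}$$

and

$$S(x) = (x_1, x_2, x_3), \quad S(x') = (x'_1, x'_2, x'_3), \quad S(x'') = (x''_1, x''_2, x''_3). \tag{122}$$

If $f(x) = 0$, then $x_3$ is odd and $x_1$ and $x_2$ are both even. Furthermore, since $x_1 x'_1 + x_2 x'_2 + x_3 x'_3 = 0$, the number

$$x'_3 = -\frac{x_1 x'_1 + x_2 x'_2}{x_3} \tag{123}$$

must be even too, and so $f(x') = 1$. Similarly, $f(x'') = 1$.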

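Secondly, it must be shown that if $f(x) = f(x') = 1$, then $f(x'') = 0$. Note that

$$f(x'') = \begin{cases} 0, & \text{if } x''_3 \text{ is odd;} \\ 1, & \text{if } x''_3 \text{ is even.} \end{cases} \tag{124}$$

However, because $f(x) = f(x') = 1$, $x_3$ and $x'_3$ are even, and precisely one of $x_1, x_2$ and precisely one of $x'_1, x'_2$ is odd. Since $x_1 x'_1 + x_2 x'_2 + x_3 x'_3 = 0$, the odd entries cannot be in the same position; without loss of generality, $x_1$ and $x'_2$ are odd and $x_2$ and $x'_1$ are even. Suppose now that $x''_3$ is even. Then $x_1 x''_1 = -(x_2 x''_2 + x_3 x''_3)$ is even, and since $x_1$ is odd, $x''_1$ is even. Similarly, $x'_2 x''_2 = -(x'_1 x''_1 + x'_3 x''_3)$ is even, and since $x'_2$ is odd, $x''_2$ must be even. Thus if $x''_3$ is even, $x''_1, x''_2, x''_3$ must all be even, which is a contradiction. Therefore, $x''_3$ must be odd and hence $f(x'') = 0$.

This completes the proof. □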
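The colouring (119) and the parity argument behind it are easy to check on small cases. The sketch below is an illustration only: the function names are hypothetical, the rescaling map computes the greatest common divisor of the integer triple directly (which agrees with (120) when the fractions are in lowest terms), and `math.gcd` and `math.lcm` with several arguments assume Python 3.9 or later.

```python
from fractions import Fraction
from math import gcd, isqrt, lcm

def primitive_triple(point):
    """The map S of (120): rescale a rational unit vector to a primitive integer triple."""
    n = lcm(*(c.denominator for c in point))
    ints = [int(n * c) for c in point]
    g = gcd(*ints)
    return tuple(i // g for i in ints)

def f(point):
    """The colouring (119): 0 if the third primitive coordinate is odd, 1 if it is even."""
    return 0 if primitive_triple(point)[2] % 2 else 1

def exactly_one_odd(triple):
    """Parity property derived around (118)."""
    return sum(c % 2 for c in triple) == 1

def primitive_pythagorean_triples(bound):
    """Primitive integer triples with entries in [-bound, bound] and square norm a square."""
    values = range(-bound, bound + 1)
    for x in values:
        for y in values:
            for z in values:
                s = x * x + y * y + z * z
                if s and isqrt(s) ** 2 == s and gcd(x, y, z) == 1:
                    yield (x, y, z)

# A mutually orthogonal triple of rational unit vectors.
triad = [
    (Fraction(1, 3), Fraction(2, 3), Fraction(2, 3)),
    (Fraction(2, 3), Fraction(1, 3), Fraction(-2, 3)),
    (Fraction(2, 3), Fraction(-2, 3), Fraction(1, 3)),
]
assert sum(f(x) for x in triad) == 2                                  # (115c)
assert all(f(x) == f(tuple(-c for c in x)) for x in triad)            # (115a)
assert all(exactly_one_odd(t) for t in primitive_pythagorean_triples(6))
```

A brute-force check like this does not replace the proof, but it makes the role of the "precisely one odd coordinate" property visible: the colour of a direction is just the parity of the third coordinate of its primitive integer representative.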



Meyer suggests that any measurement of a squared spin along some axis $r$ eventually results in the measurement of the squared spin along some axis $r'$, close to $r$, that intersects $S^2 \cap \mathbb{Q}^3$. Moreover, the selection of the direction $r'$ occurs in such a way that if one measures the squared spin along three orthogonal axes $x, y, z$, the squared spins along some orthogonal axes $x', y', z'$ are actually measured, where $x', y', z'$ intersect $S^2 \cap \mathbb{Q}^3$ and are close to the axes $x, y, z$. This makes the actual measurement empirically indistinguishable from the intended measurement, and (115c) ensures that the measurement result is in agreement with the predictions made by quantum mechanics. This is Meyer's alleged 'nullification' of the Kochen-Specker Theorem.

Of course, the fact that only rational vectors are considered in the discussion above is not that relevant. The same arguments can be used for any other dense, so-called colorable subset of $S^2$. Not much later, Kent [Ken] showed that dense colorable sets exist not only in the three-dimensional case, but in fact for any finite-dimensional Hilbert space. He also generalized the scheme to so-called positive operator valued measurements. These models have become known as MK-models. For the present discussion it is enough to know that these extensions exist; in any case, the main discussion can be focused on the three-dimensional case.



3.3 First Critics 

The number of articles that criticize Meyer's paper is substantial. Apparently, new hope for 
hidden variables is not warmly welcomed. This came as a bit of a surprise to me, since many 
of these critics do agree that there are fundamental problems with quantum mechanics. It 
seems as though some people are afraid that some problems actually might have a possible 
solution. I will only discuss a selection of these articles, hoping not to leave out too many 
interesting comments. 

The claim of Meyer that the Kochen-Specker Theorem has been nullified leads to the 
question what it is exactly that the Kochen-Specker Theorem states. A common notion is 
that the theorem states that (at any given time) not all observables can be assigned definite 
values that are independent of the measuring context. The necessary assumption to arrive 
at this conclusion is that observables obey natural functional relationships. This is proved by selecting a specific set of observables and showing that any assignment of values to those observables contradicts the functional relationships between those observables; see Section 2.3.3.



3.3.1 Non-Linearity of the MK-Models 

Cabello's first comment [Cab1] is quite straightforward, but seems to rely on a misinterpretation of the MK-models. In his view, each self-adjoint operator or each projection operator corresponds with a possible measurement. The Kochen-Specker Theorem then states that it is impossible "to conceive a hidden variable model in which the outcomes of all measurements are pre-determined".

From this point of view, Cabello is right. But clearly, the whole point of the finite precision discussion is that the 1-to-1 correspondence between observables and self-adjoint operators is not a necessary one. Cabello only focuses briefly on this viewpoint, but arrives at the conclusion that this cannot be what is meant in the MK-models since "this loophole would have very weird consequences." For one, it would imply that the superposition principle is no longer valid. Indeed, in the three-dimensional case (in the Meyer model) the operators $\sigma_x^2$ and $\sigma_y^2$ correspond to observables. But the operator $\sigma_{x+y}^2$ does not, since the $x+y$-axis intersects $S^2$ in the points $\pm\frac{1}{\sqrt{2}}(1,1,0) \notin S^2 \cap \mathbb{Q}^3$. However, the superposition principle 44, although a noticeable aspect of quantum theory, cannot be a necessary criterion for a hidden variables theory (as is also noted in [BK]). One may even argue that it is a rather vague aspect of quantum mechanics altogether.

44 Usually the superposition principle is only postulated for states. However, each of these observables can be associated with a one-dimensional projection $1 - \sigma_r^2$. Since there is a one-to-one correspondence between (equivalence classes of pure) states and one-dimensional projections, one may therefore argue that the superposition principle should also hold for these operators.

3.3.2 Contextuality of the MK-Models 

Mermin [Mer2] also recognizes that a good understanding of the Kochen-Specker Theorem is necessary in order to get the discussion going. To relate this theorem to finite precision measurements, he restates it in the following way. 45



Suppose $\Omega_{KS} = \{A_1, \ldots, A_n\}$ is a finite uncolorable set of observables (each corresponding to an operator with finite spectrum) and let $\mathcal{C}_{KS}$ be the set of all subsets of $\Omega_{KS}$ whose elements correspond to mutually commuting observables. That is, for each set $\{A_{i_1}, \ldots, A_{i_k}\} \in \mathcal{C}_{KS}$ the observables $A_{i_1}, \ldots, A_{i_k}$ can be measured simultaneously according to quantum theory. The set of all definite value assignments is again given by

$$\Lambda = \{\lambda : \Omega_{KS} \to \mathbb{R} \; ; \; \lambda(A_i) \in \sigma(A_i), \; i = 1, \ldots, n\}. \tag{125}$$

Mermin then argues that the Kochen-Specker Theorem implies that for each probability measure $\mathbb{P}$ on this space 46 there is a set $C \in \mathcal{C}_{KS}$ of mutually measurable observables and a value assignment $\lambda$ such that $\mathbb{P}(\lambda) > 0$ and such that $\lambda$ restricted to the set $C$ gives a valuation that is in conflict with the supposed functional relationships between these observables. In terms of the three-dimensional case: there is always a nonzero probability to measure the squared spin in three orthogonal directions and not find one of the outcomes (1,1,0), (1,0,1) or (0,1,1) (for a particular choice of the three directions).

Certainly, this is a legitimate restatement of the Kochen-Specker Theorem, but already its opening assumption is susceptible to the criticism of Meyer and Kent, who would deny $\Omega_{KS}$ to be a set of observables. However, there are sets of observables that are arbitrarily close (in some sense) to the objects in the set $\Omega_{KS}$, and this is where Mermin seeks a loophole in the argument of Meyer and Kent. In his own words:

"...the KS theorem is not nullified by the finite precision of real experimental setups because of the fundamental physical requirement that probabilities of outcomes of real experiments vary only slightly under slight variations in the configuration of the experimental apparatus, and because the import of the theorem can be stated in terms of whether certain outcomes never occur, or occur a definite nonzero fraction of the time in a set of randomly chosen ideal experiments." [Mer2, p. 3]

Here Mermin has smuggled in a new assumption that was not part of the original Kochen-Specker Theorem, namely, that the probability of finding a certain measurement result depends continuously on the experimental setup. Thus, if anything, Mermin has only shown that given his continuity assumption the Kochen-Specker Theorem is not nullified by Meyer and Kent. In the case of the spin-1 particle, this condition would probably look something like this:

Assumption 3.1. Let $\sigma_r^2$ denote the squared spin along the $r$-axis, with $\|r\| = 1$. Then Mermin's continuity assumption implies that for each state of the system (characterized by the probability measure $\mathbb{P}$), for each $r \in S^2$ and for each $\epsilon > 0$, there is a $\delta > 0$ such that

$$\left|\mathbb{P}[\sigma_r^2 = 1] - \mathbb{P}[\sigma_{r'}^2 = 1]\right| < \epsilon \tag{126}$$

for all $r' \in S^2$ with $\|r' - r\| < \delta$.

45 Here, somewhat more mathematical terms are used than in [Mer2] in order to obtain a better view.
46 For a $\sigma$-algebra one may take the power set.

This assumption is indeed satisfied in quantum mechanics. It remains to be investigated 
if the Meyer and Kent models can satisfy this assumption and whether or not this is a 'silly' 
assumption. 
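That quantum mechanics satisfies the continuity condition (126) can be illustrated numerically. The sketch below rests on assumptions that are not spelled out in the text: it uses the Cartesian representation of spin 1, in which the squared spin along a real unit vector $r$ is the projection $1 - |r\rangle\langle r|$ (compare the one-dimensional projection $1 - \sigma_r^2$ of footnote 44), and it restricts to a pure state given by a real unit vector $\psi$, so that $\mathbb{P}[\sigma_r^2 = 1] = 1 - \langle r, \psi\rangle^2$. All function names are hypothetical.

```python
import math

def unit(v):
    """Normalise a real 3-vector."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def prob_one(r, psi):
    """P[sigma_r^2 = 1] = 1 - <r, psi>^2 for real unit vectors (assumed representation)."""
    return 1.0 - sum(a * b for a, b in zip(r, psi)) ** 2

def delta_for(epsilon):
    """A delta that works in (126) for this model: the probability is 2-Lipschitz in r,
    because |<r,psi>^2 - <r',psi>^2| = |<r - r', psi>| |<r + r', psi>| <= 2 ||r - r'||."""
    return epsilon / 2

psi = unit((1.0, 2.0, 2.0))
r = unit((1.0, 1.0, 0.0))
for tilt in (1e-1, 1e-2, 1e-3):
    r_prime = unit((1.0, 1.0 + tilt, 0.0))
    gap = abs(prob_one(r, psi) - prob_one(r_prime, psi))
    assert gap <= 2 * math.dist(r, r_prime)
```

The point of the example is only that, in quantum mechanics, a suitable $\delta$ can be written down explicitly; it says nothing about whether a hidden-variable model satisfies the same condition.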

Mermin states that without his continuity assumption, any physical theory capable of dealing with finite-precision measurements would be quite useless. It should be noted, however, that the assumption as stated in equation (126) is of a purely theoretical nature, which is not susceptible to any experimental investigation. Indeed, when trying to determine whether the continuity criterion is satisfied, $r$ and $r'$ have to be chosen arbitrarily close to each other. A measurement of $\sigma_r^2$ then becomes indistinguishable from a measurement of $\sigma_{r'}^2$, and there is no way to compare the frequencies with which $[\sigma_r^2 = 1]$ or $[\sigma_{r'}^2 = 1]$ occur. Certainly, pragmatically speaking, any investigation of this condition leads to a verification of this condition and therefore one may consider it to be an a priori condition of scientific theories. But it is a metaphysical question whether this experimental verification is a result of the condition actually being true (since the condition does not satisfy the falsification principle).

It follows that outcomes of measurements of observables arbitrarily close to those in an uncolorable set must statistically resemble the outcomes predicted by quantum mechanics for the uncolorable observables. But one does not need this assumption, since quantum mechanics already predicts certain probabilities for the colorable observables. The only assumption needed is that the hidden variables theory reproduces those statistics, in which case the continuity criterion will automatically follow for the observables in the MK-models. In this sense the Meyer-Kent models are incomplete, since no statistical behavior is specified in these models. 47 However, incomplete as they are, Mermin claims that it is in principle impossible for the Meyer-Kent models to reproduce the statistics of quantum mechanics without re-introducing contextuality.

To state his argument for this claim, consider again an uncolorable set $\Omega_{KS} = \{A_1, \ldots, A_n\}$. For each observable $A_i$ in this set, there is an observable $A'_i$ in the colorable set that is empirically indistinguishable from $A_i$. This gives a set $\Omega'_{KS} = \{A'_1, \ldots, A'_n\}$. Mermin correctly notices that it is impossible to construct this set in such a way that for each set $C = \{A_{i_1}, \ldots, A_{i_k}\} \in \mathcal{C}_{KS}$, the set $C' = \{A'_{i_1}, \ldots, A'_{i_k}\}$ is again a set of observables corresponding to mutually commuting operators. Indeed, if this were possible, any coloring of $\Omega'_{KS}$ would automatically yield a coloring of $\Omega_{KS}$, which is assumed to be impossible. His conclusion is the following:

"[This] deficiency makes the MK set [$\Omega'_{KS}$] useless for specifying preassigned noncontextual values agreeing with quantum mechanics for the outcomes of every one of the slightly imperfect experiments that corresponds to measuring a mutually commuting subset [some $C \in \mathcal{C}_{KS}$] of observables from the ideal KS uncolorable set [$\Omega_{KS}$]." [Mer2, p. 2]

It seems a strange assumption that each time one intends to measure the observable $A_i$, in fact the same observable $A'_i$ is actually measured. Within the finite precision of measurement, however, there are countably many observables $A'_i$ that may be measured if one intends to measure $A_i$. The hidden assumption that Mermin makes is that the Meyer-Kent models have the property that whenever one attempts to measure $A_i$, the observable actually measured is always the same $A'_i$. But it is very likely that the measurement of $A'_i$ that is actually performed will fluctuate in time within the boundaries of the finite precision with which the measurement is set up. However, against this line of reasoning Mermin brings forth the following objection:

"If one tries to bridge this gap in the argument by associating more than a single nearby MK colorable observable with some of the observables in the ideal uncolorable set, one sacrifices the non-contextuality of the value assignments." [Mer2, p. 2]

From a quantum-mechanical perspective, this is indeed true. If $A$ appears in two measurement contexts $C_1$ and $C_2$, in the MK-model it may be associated with several different observables $A'_1 \in C'_1$ and $A'_2 \in C'_2$, depending on the context in which one wishes to measure $A$. However, from the point of view of the MK-model, $A'_1$ and $A'_2$ are two distinct observables and there is no reason why they should be assigned the same value, or even values close to each other. Not even an appeal to Mermin's continuity assumption helps at this point, for the only requirement is that the distributions over the value assignments for these two observables resemble each other.

47 At this point, notice that the discussion is diverting from the possible existence of non-contextual hidden variables (which is claimed to be impossible by the Kochen-Specker Theorem) to the question whether such a hidden variables theory can reproduce the statistics of quantum mechanics (about which the Kochen-Specker Theorem per se is silent).



3.4 The Statistics of MKC-Models 

As noted, the critique of Mermin focuses on the question whether or not the statistics of quantum mechanics can be reproduced by an MK-model. Clifton and Kent [CK1] recognized this shortcoming and presented a modified non-contextual hidden-variable theory that is supposed to be able to reproduce the right statistical behavior. These modified models are known as MKC-models.

The purpose of Clifton and Kent is to find a subset $\mathcal{P}_{CK}(\mathcal{H}) \subset \mathcal{P}(\mathcal{H})$ that is colorable (i.e. there exists a valuation function) and dense in $\mathcal{P}(\mathcal{H})$, where $\mathcal{H}$ is a finite-dimensional Hilbert space. That is, for each $P \in \mathcal{P}(\mathcal{H})$ and for each $\epsilon > 0$, there is a $P' \in \mathcal{P}_{CK}(\mathcal{H})$ such that $\|P - P'\| < \epsilon$, where $\|\cdot\|$ denotes the operator norm. Furthermore, it is required that the set of resolutions of the identity 48 generated by $\mathcal{P}_{CK}(\mathcal{H})$ is dense in the set of all resolutions of the identity. That is, for each resolution $\{P_1, \ldots, P_n\}$ of the identity and each $\epsilon > 0$, there is a resolution of the identity $\{P'_1, \ldots, P'_n\} \subset \mathcal{P}_{CK}(\mathcal{H})$ such that $\|P_i - P'_i\| < \epsilon$ for all $i = 1, \ldots, n$.

This is sufficient to ensure that for any self-adjoint operator $A$, an observable in the MKC-model can be found that is empirically indistinguishable from $A$. Indeed, for a finite-dimensional Hilbert space $\mathcal{H}$ of dimension $n$, each self-adjoint operator $A$ has a spectral decomposition of the form

$$A = \sum_{a \in \sigma(A)} a P_a, \tag{127}$$

where $\#\sigma(A) \leq n$ and $(P_a)_{a \in \sigma(A)}$ is a set of pairwise orthogonal projection operators that sum up to the unit operator $\mathbb{1}$. Then, for each $\epsilon > 0$, there are pairwise orthogonal projection operators $P'_a \in \mathcal{P}_{CK}(\mathcal{H})$, $a \in \sigma(A)$, such that $\|P_a - P'_a\| < \epsilon / (n \sup\{|a| \; ; \; a \in \sigma(A)\})$. 49 The operator $A' := \sum_{a \in \sigma(A)} a P'_a$ may then be seen as an observable 50 in the MKC-model, and it satisfies $\|A' - A\| < \epsilon$.

48 A resolution of the identity is a sequence $(P_i)$ of pairwise orthogonal projection operators such that $\sum_i P_i = \mathbb{1}$.

In order to define statistical behavior, it is not enough that the set $\mathcal{P}_{CK}(\mathcal{H})$ is colorable; in order to reproduce the predictions of quantum mechanics, it also needs to allow sufficiently many different colorings. To explain this, some notation will be introduced.

For an orthonormal basis $(e_i)_{i=1}^n$, let

$$\mathcal{P}_1((e_i)) = \{P_{e_1}, \ldots, P_{e_n}\} \tag{128}$$

denote the set of one-dimensional projections on the basis vectors and let $\mathcal{P}((e_i))$ be the set of all projections that project on subspaces of $\mathcal{H}$ spanned by some subset of $\{e_1, \ldots, e_n\}$ (i.e. $\mathcal{P}((e_i))$ is the Boolean algebra generated by $\mathcal{P}_1((e_i))$). Note that $\#\mathcal{P}_1((e_i)) = n$ and $\#\mathcal{P}((e_i)) = 2^n$. A valuation function on $\mathcal{P}((e_i))$ is completely determined by assigning the value 1 to precisely one of the $P_{e_i}$ (and hence, it always exists).

Definition 3.1. Two (orthonormal) bases $(e_i)_{i=1}^n$ and $(e'_i)_{i=1}^n$ are called totally incompatible if $[P, P'] \neq 0$ for all $P \in \mathcal{P}((e_i))$ and $P' \in \mathcal{P}((e'_i))$ with $P, P' \notin \{0, \mathbb{1}\}$. That is, for all $P \in \mathcal{P}((e_i))$ and $P' \in \mathcal{P}((e'_i))$, $P\mathcal{H} \subset P'\mathcal{H}$ implies $P = 0$ or $P' = \mathbb{1}$.

The central theorem of Clifton and Kent [CK1] can then be formulated as follows.

Theorem 3.3. For each finite-dimensional Hilbert space $\mathcal{H}$, there is a countable set of orthonormal bases $\{(e_i^{(1)})_{i=1}^n, (e_i^{(2)})_{i=1}^n, (e_i^{(3)})_{i=1}^n, \ldots\}$ that are pairwise totally incompatible, such that

$$\mathcal{P}_{CK}(\mathcal{H}) := \bigcup_{m=1}^{\infty} \mathcal{P}\big((e_i^{(m)})\big) \tag{129}$$

is dense in $\mathcal{P}(\mathcal{H})$ and such that the set of resolutions of the identity generated by $\mathcal{P}_{CK}(\mathcal{H})$ is dense in the set of all resolutions of the identity.

The proof given by Clifton and Kent is not very illuminating and therefore I omit it. 51 The following lemma states that this set is in fact colorable. 52

Lemma 3.4. For each $f : \mathbb{N} \to \{1, \ldots, n\}$, the function $\lambda_f : \mathcal{P}_{CK}(\mathcal{H}) \to \{0, 1\}$ given by

$$\lambda_f(P) = \begin{cases} 1, & \text{if } P_{e^{(m)}_{f(m)}} \leq P; \\ 0, & \text{otherwise} \end{cases} \qquad \text{for } P \in \mathcal{P}\big((e_i^{(m)})\big), \; m \in \mathbb{N} \tag{130}$$

is well-defined and defines a valuation function. Moreover, all valuation functions on $\mathcal{P}_{CK}(\mathcal{H})$ are of this form and the correspondence is bijective.

Proof:

To see that the function is well-defined, note that for each $P \in \mathcal{P}_{CK}(\mathcal{H})$ (unequal to $0$ and $\mathbb{1}$) there is exactly one $m$ such that $P \in \mathcal{P}\big((e_i^{(m)})\big)$, because the bases are pairwise totally incompatible.



49 If $\sup\{|a| \; ; \; a \in \sigma(A)\} = 0$, then $A = 0$ and one may simply take $A' = \epsilon P'$ for any $P' \in \mathcal{P}_{CK}(\mathcal{H})$.
50 Throughout the remainder of this section, no distinction in notation between observables and their corresponding operators will be made.
51 Especially because they do not construct $\mathcal{P}_{CK}(\mathcal{H})$ explicitly but rely on an existence proof.
52 The set of natural numbers $\mathbb{N}$ is here taken to exclude 0, i.e., $\mathbb{N} = \{1, 2, 3, \ldots\}$.

Recall that a valuation function satisfies the finite sum rule 53 for all projection operators in $\mathcal{P}_{CK}(\mathcal{H})$. In fact, it is sufficient to show that the finite sum rule is satisfied in order to show that $\lambda_f$ is a valuation function (see also [FT]). Now suppose $\{P_1, \ldots, P_k\}$ is a subset of $\mathcal{P}_{CK}(\mathcal{H}) \setminus \{0, \mathbb{1}\}$ such that $\sum_{i=1}^k P_i = \mathbb{1}$. Then all these projection operators commute. To see this, let $P_i$ and $P_j$ be any two projections in this set and let $P_h = \sum_{l \neq i,j} P_l$. Then

$$\begin{aligned} P_h + P_i + P_j = \mathbb{1} = \mathbb{1}^2 &= (P_h + P_i + P_j)^2 \\ &= P_h^2 + P_i^2 + P_j^2 + P_h(P_i + P_j) + (P_i + P_j)P_h + P_iP_j + P_jP_i \\ &= P_h + P_i + P_j + P_h(P_i + P_j) + (P_i + P_j)P_h + P_iP_j + P_jP_i. \end{aligned} \tag{131}$$

Subtracting $P_h + P_i + P_j$ from both sides leads to

$$P_h(P_i + P_j) + (P_i + P_j)P_h + P_iP_j + P_jP_i = 0. \tag{132}$$

Because $P_i + P_j = \mathbb{1} - P_h$, this results in $P_iP_j = -P_jP_i$. Finally, it follows that

$$P_iP_j = P_iP_jP_j = -P_jP_iP_j = P_jP_jP_i = P_jP_i. \tag{133}$$

Therefore, since all bases are totally incompatible, there is a unique $m \in \mathbb{N}$ such that $\{P_1, \ldots, P_k\} \subset \mathcal{P}\big((e_i^{(m)})\big)$. Then, since $\lambda_f$ restricted to each $\mathcal{P}\big((e_i^{(m)})\big)$ is a valuation function, it follows that the finite sum rule holds. This shows that $\lambda_f$ is indeed a valuation function on $\mathcal{P}_{CK}(\mathcal{H})$.

To show that all valuation functions are of this form and the correspondence is one-to-one, let $\lambda$ be an arbitrary valuation function. For each basis $(e_i^{(m)})_{i=1}^n$ it holds that $\sum_{i=1}^n P_{e_i^{(m)}} = \mathbb{1}$. Therefore, because of the finite sum rule, $f_\lambda(m)$ may be defined to be the unique element $k$ of $\{1, \ldots, n\}$ with $\lambda\big(P_{e_k^{(m)}}\big) = 1$. Because of the finite sum rule, one must have that if $P \leq P'$, then $\lambda(P) \leq \lambda(P')$. Therefore, (130) is automatically satisfied and $\lambda = \lambda_{f_\lambda}$. □
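The combinatorial content of Lemma 3.4 can be illustrated with a finite toy model. The sketch below is a simplification under stated assumptions: only finitely many bases are used (Theorem 3.3 requires countably many), a projection in the algebra of basis $m$ is encoded as a hypothetical pair `(m, S)` with `S` the set of basis indices spanning its range, and indices are 0-based. Total incompatibility is what justifies this encoding, since it guarantees that a nontrivial projection belongs to exactly one basis.

```python
from itertools import product

N_DIM = 3      # dimension n of the Hilbert space (toy value)
N_BASES = 2    # finitely many bases only, unlike Theorem 3.3

def valuation(f):
    """lambda_f of (130); f[m] is the index of the basis vector selected in basis m."""
    def lam(projection):
        m, indices = projection
        return 1 if f[m] in indices else 0
    return lam

def resolutions_of_identity(m):
    """All ways to split basis m into blocks; each block is one projection (m, S)."""
    for labels in product(range(N_DIM), repeat=N_DIM):
        blocks = {}
        for index, label in enumerate(labels):
            blocks.setdefault(label, set()).add(index)
        yield [(m, frozenset(block)) for block in blocks.values()]

all_f = list(product(range(N_DIM), repeat=N_BASES))
for f in all_f:
    lam = valuation(f)
    for m in range(N_BASES):
        for resolution in resolutions_of_identity(m):
            assert sum(lam(p) for p in resolution) == 1    # finite sum rule

# Different f give different valuations: compare their values on the atoms.
tables = {
    tuple(valuation(f)((m, frozenset({i}))) for m in range(N_BASES) for i in range(N_DIM))
    for f in all_f
}
assert len(tables) == N_DIM ** N_BASES
```

In this finite setting there are $n^M$ valuations for $M$ bases, which is the finite analogue of the identification of the set of valuation functions with $\{1, \ldots, n\}^{\mathbb{N}}$.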



In the corresponding hidden-variable theory, the set of pure states $\Lambda$ is the set of all valuation functions on $\mathcal{P}_{CK}(\mathcal{H})$. The above lemma establishes that $\Lambda \simeq \{1, \ldots, n\}^{\mathbb{N}}$. These pure states, the hidden variables, are suspected not to be empirically accessible. Instead, the system is described by selected probability distributions on the pure states. Each projection operator $P \in \mathcal{P}_{CK}(\mathcal{H})$ reappears in the hidden-variable theory as a stochastic variable $\hat{P} : \Lambda \to \{0, 1\}$ given by

$$\hat{P}(\lambda) := \lambda(P). \tag{134}$$

To show that the hidden-variable model can reproduce the statistical behavior of quantum theory, the following theorem must be proven.

Theorem 3.5. For each density operator $\rho$ on the Hilbert space $\mathcal{H}$, there is a probability measure $\mathbb{P}_\rho$ on $\Lambda$ such that

$$\mathbb{P}_\rho[\hat{P} = 1] = \mathrm{Tr}(\rho P) \tag{135}$$

for each $P \in \mathcal{P}_{CK}(\mathcal{H})$, where $[\hat{P} = 1] = \{\lambda \in \Lambda \; ; \; \lambda(P) = 1\} = \hat{P}^{-1}(\{1\})$.



53 See Corollary 2.7.
54 The claim of this theorem is not stated very clearly in [CK1] and an actual proof is lacking. Possibly it is one of the sources of confusion in the later criticisms on the MKC-models.






Proof:

In order to prove the existence of the probability measure, $\Lambda \simeq \{1, \ldots, n\}^{\mathbb{N}}$ has to be turned into a measurable space first. For a finite sequence $t_1, \ldots, t_k$ of natural numbers (not necessarily in increasing order) and a sequence $B_1, \ldots, B_k$ of subsets of $\{1, \ldots, n\}$, define

$$S(t_1, \ldots, t_k; B_1, \ldots, B_k) := \{\lambda_f \in \Lambda \; ; \; f(t_1) \in B_1, \ldots, f(t_k) \in B_k\}. \tag{136}$$

For a fixed sequence $t_1, \ldots, t_k$, let $\Sigma(t_1, \ldots, t_k)$ be the $\sigma$-algebra generated by all these so-called cylinder sets. Further, let $\Sigma$ be the smallest $\sigma$-algebra that contains all these $\sigma$-algebras, i.e.

$$\Sigma = \sigma\big(\Sigma(t_1, \ldots, t_k) \; ; \; \{t_1, \ldots, t_k\} \subset \mathbb{N}\big). \tag{137}$$

Note that $\Sigma(t_1, \ldots, t_k)$ has in fact only finitely many elements and that it is isomorphic (as a set) to the power set of $\{1, \ldots, n\}^{\{t_1, \ldots, t_k\}}$. Therefore, a probability measure on the space $(\Lambda, \Sigma(t_1, \ldots, t_k))$ is completely defined by its action on the sets that are equivalent to a singleton subset of $\{1, \ldots, n\}^{\{t_1, \ldots, t_k\}}$, i.e. the sets of the form

$$s(t_1, \ldots, t_k; j_1, \ldots, j_k) := \{\lambda_f \in \Lambda \; ; \; f(t_i) = j_i \text{ for } i = 1, \ldots, k\}. \tag{138}$$



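Now let a density operator $\rho$ be given. For each finite sequence $t_1, \ldots, t_k$ define a probability measure $\mathbb{P}_{\rho, t_1, \ldots, t_k}$ on the space $(\Lambda, \Sigma(t_1, \ldots, t_k))$ by

$$\mathbb{P}_{\rho, t_1, \ldots, t_k}[s(t_1, \ldots, t_k; j_1, \ldots, j_k)] := \prod_{i=1}^{k} \mathrm{Tr}\Big(\rho P_{e_{j_i}^{(t_i)}}\Big) \tag{139}$$

for each sequence $j_1, \ldots, j_k$ in $\{1, \ldots, n\}$.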

It is easy to see that these probability measures satisfy the following consistency criteria:

(i) For any finite sequence $t_1, \ldots, t_k$ and all permutations $(t'_1, \ldots, t'_k) = (t_{\pi(1)}, \ldots, t_{\pi(k)})$ one has

$$\mathbb{P}_{\rho, t'_1, \ldots, t'_k}[s(t'_1, \ldots, t'_k; j_{\pi(1)}, \ldots, j_{\pi(k)})] = \mathbb{P}_{\rho, t_1, \ldots, t_k}[s(t_1, \ldots, t_k; j_1, \ldots, j_k)]. \tag{140}$$

(ii) For each finite sequence $t_1, \ldots, t_k, t_{k+1}$, the measure $\mathbb{P}_{\rho, t_1, \ldots, t_k, t_{k+1}}$ acts as $\mathbb{P}_{\rho, t_1, \ldots, t_k}$ on every set in $\Sigma(t_1, \ldots, t_k)$, i.e. $\mathbb{P}_{\rho, t_1, \ldots, t_k, t_{k+1}}(S) = \mathbb{P}_{\rho, t_1, \ldots, t_k}(S)$ for all $S \in \Sigma(t_1, \ldots, t_k)$.

Then, according to Kolmogorov's extension theorem (see for example Theorem 10.1 in [BW]), there exists a probability measure $\mathbb{P}_\rho$ on the space $(\Lambda, \Sigma)$ such that $\mathbb{P}_\rho$ acts as $\mathbb{P}_{\rho, t_1, \ldots, t_k}$ on the sets in $\Sigma(t_1, \ldots, t_k)$, for each finite sequence $t_1, \ldots, t_k$.

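The only thing left to show is that $\mathbb{P}_\rho$ satisfies (135). For one-dimensional projections this
follows almost immediately:

$$\mathbb{P}_\rho[P_{e^{(l)}_m} = 1] = \mathbb{P}_\rho(\{\lambda_f \in \Lambda \,;\ f(l) = m\}) = \mathbb{P}_{\rho,l}(\{\lambda_f \in \Lambda \,;\ f(l) = m\}) = \mathrm{Tr}(\rho P_{e^{(l)}_m}). \tag{141}$$

And also for the zero and unit operator, (135) follows immediately since

$$\mathbb{P}_\rho[\mathbb{0} = 1] = \mathbb{P}_\rho(\emptyset) = 0 \quad\text{and}\quad \mathbb{P}_\rho[\mathbb{1} = 1] = \mathbb{P}_\rho(\Lambda) = 1. \tag{142}$$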
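For all the other $P \in \mathcal{P}_{CK}(\mathcal{H})$, there is exactly one $m \in \mathbb{N}$ such that $P$ is a sum of
projections $P_{e^{(m)}_k}$ from the $m$-th basis. Therefore,

$$\begin{aligned}
[P = 1] &= \{\lambda \in \Lambda \,;\ \lambda(P) = 1\} = \bigcup_{i=1}^{\infty}\bigcup_{k=1}^{n} \{\lambda \in \Lambda \,;\ \lambda(P_{e^{(i)}_k}) = 1,\ P_{e^{(i)}_k} \leq P\} \\
&= \bigcup_{k=1}^{n} \{\lambda \in \Lambda \,;\ \lambda(P_{e^{(m)}_k}) = 1,\ P_{e^{(m)}_k} \leq P\} = \bigcup_{k \,:\, P_{e^{(m)}_k} \leq P} [P_{e^{(m)}_k} = 1],
\end{aligned} \tag{143}$$

where in the third step I used the property that all the bases are totally incompatible. Since
this is a union of disjoint sets, it follows that

$$\begin{aligned}
\mathbb{P}_\rho[P = 1] &= \sum_{k \,:\, P_{e^{(m)}_k} \leq P} \mathbb{P}_\rho[P_{e^{(m)}_k} = 1] = \sum_{k \,:\, P_{e^{(m)}_k} \leq P} \mathrm{Tr}(\rho P_{e^{(m)}_k}) = \mathrm{Tr}\Bigl(\rho \sum_{k \,:\, P_{e^{(m)}_k} \leq P} P_{e^{(m)}_k}\Bigr) \\
&= \mathrm{Tr}(\rho P).
\end{aligned} \tag{144}$$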

This completes the proof. □ 



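Putting the result of Theorem 3.5 back into the definition of the measure $\mathbb{P}_\rho$, it follows
that

$$\mathbb{P}_\rho\bigl[P_{e^{(l)}_{f(l)}} = 1 \ \ \forall\, l = 1,\ldots,\infty\bigr] = \mathbb{P}_\rho(\{\lambda_f\}) = \prod_{l=1}^{\infty} \mathbb{P}_\rho\bigl[P_{e^{(l)}_{f(l)}} = 1\bigr]. \tag{145}$$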



Consequently, 



Corollary 3.6. With respect to any of the probability measures $\mathbb{P}_\rho$, two projection operators
$P_1, P_2 \in \mathcal{P}_{CK}(\mathcal{H})$ are independent as stochastic variables $P_1, P_2$ if and only if they do not
commute.

It follows from Theorem 3.5 that the non-contextual hidden-variable theory defined by
Lemma 3.4 can indeed reproduce the statistical predictions of quantum theory with arbitrary
precision. To see this, consider a quantum system in the state $\rho$ and let $A$ be a self-adjoint
operator with spectral decomposition $A = \sum_{a \in \sigma(A)} a P_a$. The quantum-mechanical probability
for finding the value $a$ is then given by $\mathrm{Tr}(\rho P_a)$.

It may be that not all the $P_a$ lie in $\mathcal{P}_{CK}(\mathcal{H})$. But for every $\epsilon > 0$ there exists a
(non-unique$^{55}$) $A' = \sum_{a \in \sigma(A)} a P'_a$ such that $P'_a \in \mathcal{P}_{CK}(\mathcal{H})$, $\sum_{a \in \sigma(A)} P'_a = \mathbb{1}$ and $\|A - A'\| < \epsilon$
(see the discussion just above Definition 3.1). Then, if one argues that a measurement of
$A$ in fact comes down to the measurement of $A'$, one finds the value $a$ with a probability
$\mathbb{P}_\rho[P'_a = 1] = \mathrm{Tr}(\rho P'_a)$ that is close to the probability predicted by quantum mechanics:

$$|\mathrm{Tr}(\rho P'_a) - \mathrm{Tr}(\rho P_a)| = |\mathrm{Tr}(\rho(P'_a - P_a))| \leq \|P'_a - P_a\|\,\mathrm{Tr}(\rho) < \epsilon. \tag{146}$$

In fact, the more precise the measurements become (i.e., the smaller $\epsilon$ becomes), the more the
probabilities obtained will resemble those predicted by quantum mechanics. This establishes,
in particular, that the continuity criterion of Mermin (Section 3.3.2) is in fact satisfied by the
MKC-models.
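The first inequality in (146) is the standard bound $|\mathrm{Tr}(\rho X)| \leq \|X\|\,\mathrm{Tr}(\rho)$ for a density operator $\rho$ and a bounded self-adjoint operator $X$. A short derivation, assuming a decomposition $\rho = \sum_i w_i\,|\psi_i\rangle\langle\psi_i|$ into unit vectors $\psi_i$ with weights $w_i \geq 0$ (the symbols $X$, $w_i$ and $\psi_i$ are introduced here only for illustration), runs as follows:

```latex
\begin{aligned}
|\mathrm{Tr}(\rho X)|
  &= \Bigl|\sum_i w_i \langle \psi_i, X \psi_i \rangle\Bigr|
   \leq \sum_i w_i \,|\langle \psi_i, X \psi_i \rangle| \\
  &\leq \|X\| \sum_i w_i
   = \|X\|\,\mathrm{Tr}(\rho).
\end{aligned}
```

Taking $X = P'_a - P_a$ and using $\mathrm{Tr}(\rho) = 1$ gives the estimate used in (146).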



55 It is of course not necessary to take $a \in \sigma(A)$. One may also take another set $\sigma(A)'$ such that $a'$ is close
to $a$ whenever $P'_a$ is close to $P_a$. However, this is not very convenient notation-wise and it is already sufficient
to only consider taking $a \in \sigma(A)$.
3.5 Further Criticism

3.5.1 An Empirical Discrepancy with Quantum Mechanics (Part I) 

In [Cab2], Cabello claims exactly the opposite of Clifton and Kent. Namely, he states that
neither the model of Meyer, nor the models of Clifton and Kent can reproduce the statistics
of quantum mechanics. It is not surprising that Meyer's model does not possess this property,
for in the proof of Proposition 3.1 only one possible coloring of the set $S^2 \cap \mathbb{Q}^3$ was described.
Without much use of imagination, this result can be extended to obtain three different
colorings, but surely the richness of quantum stochastics cannot be reproduced by a probability
space with only three elements. Also, it seems to me pointless to falsify a claim that was never
made, certainly since the absence of a statistical part of the models of Meyer and Kent was
one of the main motivations for Clifton and Kent to construct their models.

The proof Clifton and Kent gave to show that their models can reproduce the statistics of 
quantum theory seems convincing, and it is a shame that Cabello did not point out what he 
thought was wrong with this proof. It seems that Cabello was probably puzzled when Clifton 
and Kent stated that

". . . it should already be clear that the set of truth valuations [. . . ] will be
sufficiently rich to recover the statistics of any quantum state by averaging over the
values of the hidden variables that determine the various truth valuations." [CK1,
p. 6]

This seemed indeed a puzzling statement, but I think it should have been clarified now by
Theorem 3.5.

Instead, Cabello focuses on the statistical behavior of two specific observables. Consider
again the sphere $S^2$ and let $\mathcal{D}$ be a dense subset for the MKC-model. That is, $\mathcal{D}$ is associated
with the one-dimensional projections in $\mathcal{P}_{CK}(\mathbb{C}^3)$. Let $P_e$ denote the projection on the line
spanned by some vector $e$ and consider $e_1 = \tfrac{1}{\sqrt{3}}(1,1,1)$ and $e_2 = \tfrac{1}{\sqrt{3}}(1,1,-1)$. Cabello now
argues that for each probability measure $\mathbb{P}$ on the set $\Lambda$ of pure states (i.e., colorings of the
set $\mathcal{P}_{CK}(\mathbb{C}^3)$), the probability $\mathbb{P}[P_{e'_1} = P_{e'_2} = 1]$ can be made arbitrarily small as long as $e'_1$
and $e'_2$ are taken close enough to $e_1$ and $e_2$. In more precise terms:

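Lemma 3.7. For each $\epsilon > 0$, there exists a $\delta > 0$, such that for all $P_{e'_1}, P_{e'_2} \in \mathcal{P}_{CK}(\mathbb{C}^3)$ with
$\|e'_i - e_i\| < \delta$ ($i = 1, 2$), one has

$$\mathbb{P}[P_{e'_1} = P_{e'_2} = 1] < \epsilon \tag{147}$$

for all probability measures $\mathbb{P}$ on $\Lambda$.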

Before turning to the proof, it is useful to ask why this property is supposed to be in
conflict with quantum mechanics. In the usual interpretation, the observables associated with
the operators $P_{e_1}$ and $P_{e_2}$ cannot be measured simultaneously (since the operators do not
commute). The empirical result of finding the result 1 for both observables
must therefore be obtained by performing the experiments after each other, which calls for
the use of the von Neumann postulate or some other postulate that describes the dynamics
of measurements. To avoid this discussion, consider a system that is already described by the
state $e_1$. If one then first measures $P_{e_1}$ and then $P_{e_2}$, the probability of finding for both the
value 1 is given by the transition probability

$$\mathbb{P}_{QM}[P_{e_1} = P_{e_2} = 1] = (e_1, P_{e_2} e_1) = |(e_1, e_2)|^2 = \tfrac{1}{9}. \tag{148}$$
So if Lemma 3.7 holds, this does seem a reasonable proof of the inability of the MKC-models
to reproduce quantum predictions. Hence a careful investigation of Cabello's proof is
called for.



Proof of Lemma 3.7:

Consider the following two orthonormal bases of $\mathbb{C}^3$:

$$\begin{aligned}
f_1 &= \tfrac{1}{\sqrt{2}}(0,1,-1), & f_2 &= \tfrac{1}{\sqrt{2}}(0,1,1), & f_3 &= (1,0,0); \\
g_1 &= \tfrac{1}{\sqrt{2}}(1,0,-1), & g_2 &= \tfrac{1}{\sqrt{2}}(1,0,1), & g_3 &= (0,1,0).
\end{aligned} \tag{149}$$

These satisfy $f_1 \perp e_1 \perp g_1$, $f_2 \perp e_2 \perp g_2$ and $f_3 \perp g_3$. Consequently, for every $\epsilon', \epsilon'' > 0$ two
orthonormal bases $(f'_1, f'_2, f'_3)$, $(g'_1, g'_2, g'_3)$ can be chosen that satisfy the following criteria:

• $P_{f'_i}, P_{g'_i} \in \mathcal{P}_{CK}(\mathbb{C}^3)$ for $i = 1, 2, 3$.

• $\|P_{f'_1}P_{e_1}\| < \epsilon'$, $\|P_{g'_1}P_{e_1}\| < \epsilon'$, $\|P_{f'_2}P_{e_2}\| < \epsilon'$, $\|P_{g'_2}P_{e_2}\| < \epsilon'$.

• $\|P_{f'_3}P_{g'_3}\| < \epsilon''$.

Indeed, this is done by choosing the bases close enough to the bases $(f_1, f_2, f_3)$ and $(g_1, g_2, g_3)$.
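The orthogonality relations among these vectors, and the transition probability between $e_1$ and $e_2$, are easy to verify with exact arithmetic. The sketch below works with unnormalised integer representatives of the vectors; the helper names `dot`, `overlap_squared` and `is_orthogonal_basis` are introduced here only for illustration.

```python
from fractions import Fraction as F


def dot(u, v):
    """Real inner product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def overlap_squared(u, v):
    """|(u, v)|^2 for the normalised versions of the integer vectors u and v."""
    return F(dot(u, v) ** 2, dot(u, u) * dot(v, v))


def is_orthogonal_basis(vs):
    """True if the three vectors are pairwise orthogonal."""
    return all(dot(vs[i], vs[j]) == 0 for i in range(3) for j in range(i + 1, 3))


# Unnormalised representatives of e_1, e_2 and of the two bases.
e1, e2 = (1, 1, 1), (1, 1, -1)
f = [(0, 1, -1), (0, 1, 1), (1, 0, 0)]
g = [(1, 0, -1), (1, 0, 1), (0, 1, 0)]

checks = {
    "f is an orthogonal basis": is_orthogonal_basis(f),
    "g is an orthogonal basis": is_orthogonal_basis(g),
    "f1, g1 orthogonal to e1": dot(f[0], e1) == 0 and dot(g[0], e1) == 0,
    "f2, g2 orthogonal to e2": dot(f[1], e2) == 0 and dot(g[1], e2) == 0,
    "f3 orthogonal to g3": dot(f[2], g[2]) == 0,
}
```

Calling `overlap_squared(e1, e2)` gives the transition probability between the two normalised vectors as an exact fraction.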

Now, in the MKC-model, an attempted measurement of $P_{e_1}$ leads to an actual measurement
of $P_{e'_1}$ and similarly, an attempted measurement of $P_{e_2}$ leads to an actual measurement
of $P_{e'_2}$ with $\|e'_i - e_i\| < \delta$, where $\delta > 0$ denotes the precision of the measurement. Since $\delta$ is
controlled by the experimenter, it can be taken arbitrarily small.

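Consider now the set of all $\lambda \in \Lambda$ that satisfy $\lambda(P_{e'_1}) = \lambda(P_{e'_2}) = 1$. Cabello argues that $\epsilon'$
and $\delta$ can be taken small enough such that for any probability measure $\mathbb{P}$ that is supposed to
describe a quantum state, one has

$$\begin{aligned}
\mathbb{P}[P_{f'_1} = 1 \mid P_{e'_1} = P_{e'_2} = 1] &< \tfrac{1}{8}\epsilon; & \mathbb{P}[P_{g'_1} = 1 \mid P_{e'_1} = P_{e'_2} = 1] &< \tfrac{1}{8}\epsilon; \\
\mathbb{P}[P_{f'_2} = 1 \mid P_{e'_1} = P_{e'_2} = 1] &< \tfrac{1}{8}\epsilon; & \mathbb{P}[P_{g'_2} = 1 \mid P_{e'_1} = P_{e'_2} = 1] &< \tfrac{1}{8}\epsilon,
\end{aligned} \tag{150}$$

because each of the pairs $(f'_1, e'_1)$, $(g'_1, e'_1)$, $(f'_2, e'_2)$ and $(g'_2, e'_2)$ is arbitrarily close to being
orthogonal (by choosing $\epsilon'$ and $\delta$ sufficiently small). Therefore, for 'almost all'$^{56}$ $\lambda \in \Lambda$
with $\lambda(P_{e'_1}) = \lambda(P_{e'_2}) = 1$, one should have $\lambda(P_{f'_1}) = \lambda(P_{f'_2}) = \lambda(P_{g'_1}) = \lambda(P_{g'_2}) = 0$. For
these $\lambda$ one has $\lambda(P_{f'_3}) = \lambda(P_{g'_3}) = 1$. But, if $\epsilon''$ is taken small enough, one should have
$\mathbb{P}[P_{f'_3} = P_{g'_3} = 1] < \epsilon/2$, because $f'_3, g'_3$ are almost orthogonal.
To recapitulate, one has

$$\begin{aligned}
\mathbb{P}[P_{e'_1} = P_{e'_2} = 1] ={}& \mathbb{P}[P_{e'_1} = P_{e'_2} = 1,\ P_{f'_3} = P_{g'_3} = 1] \\
&+ \mathbb{P}[P_{e'_1} = P_{e'_2} = 1,\ P_{f'_3} = 0 \text{ or } P_{g'_3} = 0] \\
\leq{}& \mathbb{P}[P_{f'_3} = P_{g'_3} = 1] + \mathbb{P}[P_{f'_3} = 0 \text{ or } P_{g'_3} = 0 \mid P_{e'_1} = P_{e'_2} = 1] \\
<{}& \tfrac{1}{2}\epsilon + \mathbb{P}[P_{f'_1} = 1 \mid P_{e'_1} = P_{e'_2} = 1] + \mathbb{P}[P_{g'_1} = 1 \mid P_{e'_1} = P_{e'_2} = 1] \\
&+ \mathbb{P}[P_{f'_2} = 1 \mid P_{e'_1} = P_{e'_2} = 1] + \mathbb{P}[P_{g'_2} = 1 \mid P_{e'_1} = P_{e'_2} = 1] \\
<{}& \tfrac{1}{2}\epsilon + \tfrac{1}{8}\epsilon + \tfrac{1}{8}\epsilon + \tfrac{1}{8}\epsilon + \tfrac{1}{8}\epsilon = \epsilon.
\end{aligned} \tag{151}$$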



□

56 This is to be considered with respect to $\mathbb{P}$, but not in the usual measure-theoretic sense. Cabello's
argument becomes a bit obscure at this point.



This proof, however, relies on an implicit assumption of continuity. Although the allowed
probability distributions in the MKC-models (i.e. the ones of Theorem 3.5) are continuous in
a certain sense, the pure states are not. Indeed, in the previous section it was shown that for
every $P_e \in \mathcal{P}_{CK}(\mathcal{H})$ and every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|\mathbb{P}[P_e = 1] - \mathbb{P}[P_{e'} = 1]| < \epsilon \tag{152}$$

for all $e'$ with $\|e - e'\| < \delta$ (this resembles the continuity criterion of Mermin). However, this
does not imply that

$$|\mathbb{P}[P_e = P_{e'} = 1] - \mathbb{P}[P_e = 1]| \tag{153}$$

becomes small as $e'$ approaches $e$. To be more explicit, in Cabello's proof it is assumed that
if $(e_n)$ is a sequence that approaches a vector $e$ that is orthogonal to $f$, then

$$\lim_{n \to \infty} \mathbb{P}[P_{e_n} = P_f = 1] = \mathbb{P}[P_e = P_f = 1] = 0 \tag{154}$$

for allowed probability distributions. This is simply not true. In fact, the discontinuity of the
pure states is an essential part of the MKC-models (see below).
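To see concretely why (153) need not become small, one may assume that the measure is one of the measures $\mathbb{P}_\rho$ of Theorem 3.5 and that $e' \neq e$ is not orthogonal to $e$, so that $P_e$ and $P_{e'}$ do not commute and are therefore independent by Corollary 3.6. Under these assumptions, and using the continuity expressed by (152):

```latex
\begin{aligned}
\mathbb{P}_\rho[P_e = P_{e'} = 1]
  &= \mathrm{Tr}(\rho P_e)\,\mathrm{Tr}(\rho P_{e'})
   \;\longrightarrow\; \mathrm{Tr}(\rho P_e)^2
   \qquad (e' \to e), \\
\bigl|\mathbb{P}_\rho[P_e = P_{e'} = 1] - \mathbb{P}_\rho[P_e = 1]\bigr|
  &\;\longrightarrow\; \mathrm{Tr}(\rho P_e)\bigl(1 - \mathrm{Tr}(\rho P_e)\bigr).
\end{aligned}
```

The limit vanishes only if $\mathrm{Tr}(\rho P_e)$ equals 0 or 1, so for a generic state the expression (153) stays bounded away from zero.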

Cabello's argument (although implicitly) does point out a feature of the MKC-models
that is not understood clearly yet, namely, the possible measurement of observables whose
corresponding operators do not commute. It is no wonder that the models are silent about
this topic, since it is also an unclear point in quantum theory itself. However, in the case
considered in Lemma 3.7 it can easily be seen that the MKC-model does reproduce equation
(148) because of Theorem 3.5. Indeed, in this case one has $\rho = P_{e_1}$ and using Corollary 3.6
one finds

$$\begin{aligned}
\mathbb{P}_\rho[P_{e'_1} = P_{e'_2} = 1] &= \mathbb{P}_\rho[P_{e'_1} = 1]\,\mathbb{P}_\rho[P_{e'_2} = 1] = \mathrm{Tr}(P_{e_1}P_{e'_1})\,\mathrm{Tr}(P_{e_1}P_{e'_2}) \\
&\simeq \mathrm{Tr}(P_{e_1}P_{e_1})\,\mathrm{Tr}(P_{e_1}P_{e_2}) = \tfrac{1}{9}.
\end{aligned}$$

So, not only does the proof of Cabello contain a flaw, the asserted lemma isn't true either.
However, implicitly Cabello has shown that it is a necessary condition for the MKC-models
that they violate (154).
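The approximation in the last computation can be made tangible numerically. The following sketch assumes the product form $\mathrm{Tr}(P_{e_1}P_{e'_1})\,\mathrm{Tr}(P_{e_1}P_{e'_2})$ used above and a hypothetical perturbation that shifts the first coordinate of each vector by a small rational `eta`. Whether the perturbed directions actually belong to the colorable set is not checked here; only the continuity of the product is illustrated.

```python
from fractions import Fraction as F


def dot(u, v):
    """Real inner product of two vectors with rational entries."""
    return sum(a * b for a, b in zip(u, v))


def born(u, v):
    """Tr(P_u P_v) = |(u, v)|^2 / (|u|^2 |v|^2) for the lines spanned by u and v."""
    return F(dot(u, v) ** 2) / (dot(u, u) * dot(v, v))


def joint_probability(state, a, b):
    """Product form: P_rho[P_a = P_b = 1] = Tr(rho P_a) Tr(rho P_b), with rho = P_state."""
    return born(state, a) * born(state, b)


def perturbed(v, eta):
    """Hypothetical perturbation: shift the first coordinate by eta."""
    return (v[0] + eta, v[1], v[2])


# Unnormalised representatives of e_1 and e_2.
e1 = (F(1), F(1), F(1))
e2 = (F(1), F(1), F(-1))

for eta in (F(1, 10), F(1, 100), F(1, 1000)):
    p = joint_probability(e1, perturbed(e1, eta), perturbed(e2, eta))
    # Print the perturbation, the joint probability and its distance from 1/9.
    print(eta, float(p), float(abs(p - F(1, 9))))
```

As the perturbation shrinks, the joint probability approaches the quantum-mechanical value of (148) instead of tending to zero, which is the opposite of what Lemma 3.7 asserts.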



3.5.2 Non-Classicality 

The most extended criticism the MKC-models have attracted is due to Appleby [App1],
[App2], [App3] and [App4]. From Appleby's investigation, three main objections against
the MKC-models may be distinguished:

(i) The MKC-models are not classical in some sense. 

(ii) The MKC-models are contextual in some sense. 

(iii) The MKC-models are not robust with regard to the finite precision of measurements. 

I chose to use the expression "in some sense", because in my opinion the validity of the first two
claims is partly a matter of taste. That is, they depend on what one is supposed to expect from
a hidden-variable theory. Appleby states that, although the MKC-models, strictly speaking,
do nullify the Kochen-Specker Theorem,$^{57}$ they do not nullify the essential point made by the
Kochen-Specker Theorem, which, according to Appleby, is that

". . . quantum mechanics (whether relativistic or not) is inconsistent with classical
notions of physical reality." [App1, p. 1]

This is, of course, nothing new. It was already clear in the founding days of quantum mechanics 
that this new theory was inconsistent with classical notions due to the wave-particle duality 
and the superposition principle. So naturally, the question arises what is meant by these
"classical notions" in this case. In [App1] three criteria are introduced:

1) "To each observable quantity characterizing a system there corresponds an objective 
physical quantity, which has a determinate value at every instant." 

2) "An ideal, perfectly precise measurement gives, with certainty, a value which exactly 
coincides with the value which the quantity being measured objectively did possess, 
immediately before the measurement process was initiated." 

3) "A non-ideal, approximate measurement gives, with high probability, a value which is close 
to the value which the quantity being measured objectively did possess, immediately 
before the measurement process was initiated." 

Criterion 1) is the only one on which the original Kochen-Specker Theorem focuses. The fact 
that the MKC-models do satisfy it brings the alleged nullification. It may be good to dispose 
of a possible misconception as to why this criterion is satisfied. For each finite-dimensional 
Hilbert space, the MKC-models provide a function that assigns to each self-adjoint operator 
in a dense subset of all the self-adjoint operators a value in its spectrum in such a way that 
the FUNC-rule is satisfied. By the finite precision of measurement, it is then argued that 
every time one intends to measure an observable, actually an observable corresponding to a 
self-adjoint operator in the colorable set is being measured. The other self-adjoint operators 
simply do not correspond to any observable at all. I regard this as one of the few nice features 
of these models, since it implies that there are only (at most) countably many observables. 

It may also be noted that the MKC-models are in fact models about infinitely precise 
measurements. Each of the observables in the theory can be measured with arbitrary precision 
and a measurement of an observable gives precisely the result it possesses. The only reason for 
introducing the notion of imprecise measurements is for arguing that the theory of quantum 
mechanics and the theory of MKC (both theories about precise measurements) are empirically 
indistinguishable. Theoretically, however, they are fundamentally different (as they must be 
according to the Kochen-Specker Theorem). In fact, if one were able to perform infinitely 
precise measurements, one would be able to distinguish the MKC-models from quantum theory 
simply by trying to measure one of the observables that are supposed to exist according to 
quantum theory, but which is non-existent according to the MKC-models. It follows that the 
MKC-models do in fact satisfy the criteria 1) and 2) (as is also acknowledged by Appleby). 

The rest of the article [App1] is devoted to proving that the MKC-models cannot satisfy
criterion 3). However, a similar conclusion can already be drawn from the previous section.
Indeed, if it is held that 3) should imply (154), it follows from Lemma 3.7 that the MKC-models
cannot satisfy 3) (this is also noted in [App2] and [App4]).



57 In the later two articles [App2] and [App4] (the first version of [App3] was written before [App2]) Appleby
seems to doubt this earlier claim. I will return to this point in the next section.
In [App4] a more direct proof of the violation of 3) is given, motivated by looking at the
coloring of Meyer (Proposition 3.1). The function $f$ introduced there is densely discontinuous,
i.e., every non-empty open subset of $S^2 \cap \mathbb{Q}^3$ contains both elements that are assigned the value
1 by $f$ and elements that are assigned the value 0. Appleby calls this feature "pathologically"
discontinuous. It is taken to imply that any imprecise measurement of an observable $P_e$ (for
some $e \in S^2 \cap \mathbb{Q}^3$) might just as well yield the value $1 - f(e)$ instead of the value $f(e)$ (since an
imprecise measurement of $P_e$ will result in revealing the value of some $P_{e'}$ with $e'$ close to $e$).
It is not a priori clear that the pure states in the MKC-models should also exhibit such densely
discontinuous features. However, Appleby has proven that for any valuation function $f$ on
any dense subset of $\mathcal{P}(\mathbb{C}^3)$, there is a non-empty open set on which $f$ is densely discontinuous.
It therefore turns out that 3) is indeed necessarily violated by the MKC-models.

Is criterion 3) actually necessary for a hidden-variable theory? I don't see why this should
be the case, and neither do Barrett and Kent [BK].$^{58}$ Appleby also seems to have some
doubts about whether or not 3) should be a necessary condition, since a great deal of the
papers [App2] and [App4] are devoted to arguing that the MKC-models in fact violate a
criterion that is stronger than 3), namely, that measurements in the MKC-models do not
furnish any information about the system at hand:

"MKC focus on the point that, in their models, a measurement does always re- 
veal the pre-existing value of something. [. . .] What they overlook is that, [. . . ] 
although the experimenter learns a value, s/he has no idea what it is a value of. 
Consequently, the experimenter does not acquire any actual knowledge." [App4,
p. 4]

This line of reasoning is of course true, but the conclusion drawn from it isn't. Although the
experimenter doesn't know of which observable he/she has acquired the value, he/she does
know approximately of which observable it is the value. Thus an (imprecise) measurement
doesn't provide an imprecise result of the observable one intended to measure but, instead, 
provides a precise result of an imprecise observable. In my opinion, this counts as actual 
knowledge. Moreover, performing the same measurement on an ensemble of systems, an 
experimenter does acquire knowledge about the macro state of the system. In that setting, it 
is no longer of importance what the actual observable being measured is (by the continuity of 
the probability distributions). 
Appleby goes on to argue that 



"What emerges from this is that PMKC$^{59}$ have been asking the wrong question.
The important question is not: 'How much of $S^2$ [. . . ] can be coloured at all?'
But rather: 'How much of $S^2$ can be coloured in such a way that the colours are
empirically knowable?'" [App4, p. 4]

It seems to me that here it is Appleby who is asking the wrong question. By demanding that
'the colors' should be empirically knowable, Appleby implicitly demands that the hidden
variables $\lambda$ of the MKC-models should be empirically knowable. That is, the hidden variables are
not allowed to be hidden. From a realist point of view this misses the point. The investigation



of hidden-variable theories is based on the question whether or not quantum mechanics could
be completed. That is, the question is if it is in principle possible to assign definite values to
all observables.$^{60}$

58 It should be noted that [BK] was written after [App4], despite its earlier publication in the same journal.

59 The P here stands for Pitowsky who was, in fact, the first to show that a densely defined valuation
function on $S^2$ exists [Pit]. This model is non-constructive and relies on an unorthodox view on probability
theory. However, the main reason for omitting it here is that I didn't find that it contributed much to the
story.

In conclusion, I would say that it is a matter of taste whether one considers the MKC- 
models to be sufficiently classical or not and to what extent one requires a hidden-variable 
theory to be classical. The only hard fact is that the models violate Appleby's criterion 3). 
However, as it turns out, this is not the end of the story. 



3.5.3 An Empirical Discrepancy with Quantum Mechanics (Part II) 

In another attempt to prove that the MKC-models do not satisfy 3) (and that they are
contextual in some sense), Appleby (in [App1]) proved a more disturbing feature of the
MKC-models: if one takes into account the finite precision of measurement correctly, the
MKC-models empirically violate the predictions made by quantum mechanics. This is what is
meant by the non-robustness of (iii).

Recall that (as argued above) neither the MKC-models, nor quantum mechanics define
theories that incorporate the finite precision of measurements explicitly. So one has to keep
in mind that what Appleby in fact shows is that a certain modification of the MKC-models is
incompatible with a certain modification of quantum theory. Indeed, a great deal of the paper
[App1] is devoted to presenting a modification of quantum theory that does allow one to
compare measurement results of imprecisely measured (and therefore possibly incompatible)
observables. Although this modification is interesting in itself, I will not discuss it here; those
of its results that are necessary for this discussion will be presented along the way.

The argument takes place in the setting of the spin-1 particle. Consider, again, the
measurement of the squared spin along some axis $r$. Although only the operators $\sigma_r^2 \in \mathcal{P}_{CK}$
correspond to actual observables, in an actual experiment one may of course also intend to
measure $\sigma_{r'}^2$, for some $\sigma_{r'}^2 \notin \mathcal{P}_{CK}$ (especially since it may not be known to the experimentalist
what the set $\mathcal{P}_{CK}$ exactly is). So which values of $r'$ are acceptable? Appleby assumes that they
are acceptable as long as they are "finitely specifiable". One may think of
computable real numbers.$^{61}$ It doesn't actually matter much which set is taken, as long as
three criteria are satisfied:

• The set contains all the $r$ for which $\sigma_r^2 \in \mathcal{P}_{CK}$ (let's denote this set $S^2_{CK}$).

• The set contains a subset $\{r_1, \ldots, r_n\}$ that is uncolorable.

• The set is countable.

Let $S^2_A$ be a set that satisfies the above three criteria. An object $\sigma_{r'}^2$ for $r' \in S^2_A \setminus S^2_{CK}$ may be
called a pseudo-observable.

Now, the intended measurement of a pseudo-observable $\sigma_{r'}^2$ necessarily results in the
exposure of the value of $\sigma_r^2$ for some $r \in S^2_{CK}$. Appleby assumes that this direction $r$ is selected
by some probability measure $\mu_{r'}$ on the set $S^2_{CK}$. For given $\lambda$, define

$$p_\lambda(r') := \mu_{r'}\{r \in S^2_{CK} \,;\ \lambda(\sigma_r^2) = 1\}, \tag{156}$$

60 Where it remains a topic of discussion what should be considered to be the set of all observables.

61 Roughly stated, this is the set of numbers that can be estimated up to arbitrary precision using an
algorithm on a Turing machine. It is a countable set that not only incorporates all rationals, but also numbers
like $\sqrt{2}$.
the probability of finding the value 1 if one intends to measure $\sigma_{r'}^2$, given that the particle is
in the state $\lambda$. Of course, such a measure $\mu_{r'}$ may be assumed to exist for all $r' \in S^2_A$ (one
may even take $\mu_{r'}$ to be the Dirac measure for the set $\{r'\}$ whenever $r' \in S^2_{CK}$). Then each
state $\lambda$ induces a generalized state $\tilde{\lambda} : \{\sigma_{r'}^2 \,;\ r' \in S^2_A\} \to \{0, 1\}$ by

$$\tilde{\lambda}(\sigma_{r'}^2) := \begin{cases} 0, & \text{if } p_\lambda(r') < \tfrac{1}{2}, \\ 1, & \text{if } p_\lambda(r') \geq \tfrac{1}{2}. \end{cases} \tag{157}$$

It is not easy to give a clear interpretation of these generalized states. They don't contain
any more information than that $\tilde{\lambda}(\sigma_{r'}^2)$ gives the most probable outcome of the measurement
of $\sigma_{r'}^2$, given that the state is $\lambda$. Any extra information given by the measure $\mu_{r'}$ is thrown
away. Also, note that in general for $r \in S^2_{CK}$ one does not have $\tilde{\lambda}(\sigma_r^2) = \lambda(\sigma_r^2)$.
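The construction of a generalized state can be sketched for a finitely supported measure. In the sketch below the direction labels, the valuation and the weights are hypothetical toy data, and the choice to resolve a tie at probability 1/2 in favour of the value 1 is an assumption of the sketch.

```python
from fractions import Fraction as F


def p_lambda(mu, valuation):
    """mu{ r : lambda(sigma_r^2) = 1 } for a finitely supported measure mu."""
    return sum(weight for r, weight in mu.items() if valuation[r] == 1)


def generalized_state(mu, valuation):
    """Most probable outcome; a tie at 1/2 is resolved to 1 (an assumption)."""
    return 1 if p_lambda(mu, valuation) >= F(1, 2) else 0


# Hypothetical valuation lambda on three colorable directions.
valuation = {"r_a": 1, "r_b": 0, "r_c": 1}

# Hypothetical measure for an intended measurement along r_a whose alignment
# error mostly selects the neighbouring direction r_b.
mu_blurred = {"r_a": F(1, 5), "r_b": F(7, 10), "r_c": F(1, 10)}

# Dirac measure: the intended direction r_a is always the one measured.
mu_dirac = {"r_a": F(1)}
```

With `mu_blurred` the generalized state assigns 0 to the direction `r_a` although the underlying valuation assigns 1, which illustrates the remark that the two need not agree; with `mu_dirac` they coincide.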

Since $S^2_A$ is uncolorable, there exists a set of orthonormal directions $e_1, e_2, e_3 \in S^2_A$ such
that

$$\tilde{\lambda}(\sigma_{e_1}^2) + \tilde{\lambda}(\sigma_{e_2}^2) + \tilde{\lambda}(\sigma_{e_3}^2) \neq 2. \tag{158}$$

Next, consider an experiment in which one actually intends to measure these pseudo-observables
and let $M_1, M_2, M_3$ denote the corresponding measurement results. In this setting, the
following lemma can be proven.$^{62}$

Lemma 3.8. Consider a spin-1 particle in the state $\lambda$ and let $\sigma_{e_1}^2, \sigma_{e_2}^2, \sigma_{e_3}^2$ be a triple of
pseudo-observables such that $e_1, e_2, e_3$ is an orthonormal basis and the relation (158) holds. Assume
that the measures $\mu_{e_1}, \mu_{e_2}, \mu_{e_3}$ are independent. Then the probability that $M_1 + M_2 + M_3 \neq 2$
is at least $\tfrac{1}{2}$.

Proof: 

Consider the event $M_1 + M_2 + M_3 = 2$. Since the measures $\mu_{e_1}, \mu_{e_2}, \mu_{e_3}$ are independent, the
probability of this event can simply be calculated by using the product measure $\mu_{e_1} \times \mu_{e_2} \times \mu_{e_3}$
on the space $S^2_{CK} \times S^2_{CK} \times S^2_{CK}$. One then has

$$\begin{aligned}
\mathbb{P}[M_1 + M_2 + M_3 = 2] ={}& \mathbb{P}[M_1 = 0, M_2 = M_3 = 1] + \mathbb{P}[M_2 = 0, M_1 = M_3 = 1] \\
&+ \mathbb{P}[M_3 = 0, M_1 = M_2 = 1] \\
={}& (1 - p_\lambda(e_1))\,p_\lambda(e_2)\,p_\lambda(e_3) + p_\lambda(e_1)\,(1 - p_\lambda(e_2))\,p_\lambda(e_3) \\
&+ p_\lambda(e_1)\,p_\lambda(e_2)\,(1 - p_\lambda(e_3)).
\end{aligned}$$

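For the sake of notational convenience, set $p_i = p_\lambda(e_i)$ for $i = 1, 2, 3$. By symmetry, one may
assume that $p_1 \leq p_2 \leq p_3$. Using relation (158), the following cases can be distinguished.
Either 1) $p_1 \geq \tfrac{1}{2}$, or 2) $p_3 < \tfrac{1}{2}$, or 3) $p_2 < \tfrac{1}{2}$ and $p_3 \geq \tfrac{1}{2}$. For the remainder of the proof, recall
that for any $p \in [0, 1]$, one has $p(1 - p) \leq \tfrac{1}{4}$.
Case 2) is easy. One has

$$\begin{aligned}
\mathbb{P}[M_1 + M_2 + M_3 = 2] &= (1 - p_1)p_2p_3 + p_1(1 - p_2)p_3 + p_1p_2(1 - p_3) \\
&\leq \tfrac{1}{2}p_2 + p_1(1 - p_3)p_3 + p_1p_3(1 - p_3) \\
&\leq \tfrac{1}{2}p_2 + \tfrac{1}{4}p_1 + \tfrac{1}{4}p_1 \leq p_2 < \tfrac{1}{2}.
\end{aligned} \tag{160}$$

In the second step the last two terms are bounded jointly: since $p_3 < \tfrac{1}{2}$, the expression
$(1 - p_2)p_3 + p_2(1 - p_3)$ does not decrease when $p_2$ is increased, and $p_2 \leq p_3$.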



62 In [App1] this result is only stated and a proof is omitted.
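For case 1), set $f(p_2, p_3) = (1 - p_2)p_3 + p_2(1 - p_3)$. It follows that

$$\begin{aligned}
\mathbb{P}[M_1 + M_2 + M_3 = 2] ={}& (1 - p_1)p_2p_3 + p_1(1 - p_2)p_3 + p_1p_2(1 - p_3) \\
={}& (1 - p_1)\bigl(p_2p_3 - f(p_2, p_3)\bigr) + f(p_2, p_3) \\
\leq{}& \sup\bigl\{\tfrac{1}{2}\bigl(p_2p_3 - f(p_2, p_3)\bigr) + f(p_2, p_3) \,;\ p_2p_3 - f(p_2, p_3) \geq 0,\ p_2, p_3 \in [\tfrac{1}{2}, 1]\bigr\} \\
&\vee \sup\bigl\{f(p_2, p_3) \,;\ p_2p_3 - f(p_2, p_3) < 0,\ p_2, p_3 \in [\tfrac{1}{2}, 1]\bigr\} \\
\leq{}& \sup\bigl\{\tfrac{1}{2}\bigl(p_2p_3 + f(p_2, p_3)\bigr) \,;\ p_2, p_3 \in [\tfrac{1}{2}, 1]\bigr\} \vee \sup\bigl\{f(p_2, p_3) \,;\ p_2, p_3 \in [\tfrac{1}{2}, 1]\bigr\} \\
={}& \tfrac{1}{2} \vee \tfrac{1}{2} = \tfrac{1}{2}.
\end{aligned} \tag{161}$$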



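Finally, for case 3) one has

$$\begin{aligned}
\mathbb{P}[M_1 + M_2 + M_3 = 2] &= (1 - p_1)p_2p_3 + p_1(1 - p_2)p_3 + p_1p_2(1 - p_3) \\
&= p_2\bigl(p_1(1 - p_3) + (1 - 2p_1)p_3\bigr) + p_1p_3.
\end{aligned} \tag{162}$$

Note that both $(1 - p_3) \geq 0$ and $(1 - 2p_1) \geq 0$, so the factor multiplying $p_2$ is never negative. Therefore

$$\begin{aligned}
\mathbb{P}[M_1 + M_2 + M_3 = 2] &\leq \tfrac{1}{2}\bigl(p_1(1 - p_3) + (1 - 2p_1)p_3\bigr) + p_1p_3 \\
&= \tfrac{1}{2}(p_1 + p_3 - p_1p_3) \leq \tfrac{1}{2}.
\end{aligned} \tag{163}$$

So for each of the three possible cases one has

$$\mathbb{P}[M_1 + M_2 + M_3 \neq 2] = 1 - \mathbb{P}[M_1 + M_2 + M_3 = 2] \geq \tfrac{1}{2}. \tag{164}$$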

□ 
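The bound of Lemma 3.8 can be cross-checked by brute force on a grid. The sketch below assumes that the three outcomes are independent with success probabilities $p_1, p_2, p_3$, and that a generalized value equals 1 exactly when the corresponding probability is at least 1/2 (the tie convention is an assumption of the sketch). The grid resolution `N` is an arbitrary choice.

```python
from fractions import Fraction as F
from itertools import product


def prob_sum_is_two(p1, p2, p3):
    """P[M1 + M2 + M3 = 2] for independent outcomes with P[M_i = 1] = p_i."""
    return (1 - p1) * p2 * p3 + p1 * (1 - p2) * p3 + p1 * p2 * (1 - p3)


def satisfies_158(ps):
    """Relation (158): the generalized values (threshold 1/2) do not sum to 2."""
    return sum(1 for p in ps if p >= F(1, 2)) != 2


N = 20  # grid resolution, arbitrary for this sketch
grid = [F(k, N) for k in range(N + 1)]

# All grid triples that are admissible in the sense of relation (158).
admissible = [ps for ps in product(grid, repeat=3) if satisfies_158(ps)]

# Largest probability of the event M1 + M2 + M3 = 2 over the admissible triples.
worst = max(prob_sum_is_two(*ps) for ps in admissible)
print(worst)
```

On this grid the maximum is exactly 1/2, attained for instance at $(p_1, p_2, p_3) = (\tfrac12, 1, 1)$, so the bound cannot be improved. Without the constraint (158) the probability can be as large as 1, for example at $(0, 1, 1)$.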



Appleby then concludes that if the finite precision of measurement is taken into account
correctly (namely by introducing the measures $\mu_r$), the MKC-models provide a non-negligible
probability ($\geq \tfrac{1}{2}$) to find a measurement result that contradicts the FUNC-rule, no matter
how precise the measurement becomes (i.e., no matter how much the measures $\mu_r$ approach
the appropriate Dirac measures). On the other hand, Appleby's theory of imprecise
measurements for quantum mechanics predicts that in that theory this probability goes to zero as
measurements become more precise. He then reaches his conclusion:

"This establishes that, if the alignment errors are random, and statistically in- 
dependent, then the model must exhibit a form of contextuality: for it means 
that the probable outcome of an approximate measurement must, in general, be 
strongly dependent, not only on the observable which is being measured, but also 
on the particular way in which the measurement is carried out. It follows that, if 
the stated assumptions are true, the model fails to satisfy clause 3). . . " [App1,
p. 17]
Although Appleby focuses his objections on the statement that the MKC-models cannot
satisfy clause 3), I think the direct consequence of Lemma 3.8, stating that the MKC-models
make predictions that are empirically different from the predictions made by quantum theory,
is the strongest objection that can be made against these models. It is indeed a bit strange
that Appleby doesn't come back to this point extensively in any of his later articles.

The reply of Barrett and Kent to this objection isn't very extensive either. They simply 
state that the assumption that imprecise measurements may lead to simultaneous measure- 
ments of observables that do not correspond to commuting observables is wrong. In their own 
words: 

"... in a CK model for projective measurements, the projectors actually measured 
are always commuting (assuming that they are measured simultaneously) -this is 
one of the axioms of the theory that relate its mathematical structure to the 
world. . . " [BK, p. 163]

This defense works only as long as the pseudo-observables $\sigma_{e_1}^2$, $\sigma_{e_2}^2$ and $\sigma_{e_3}^2$ are measured
simultaneously, for then the entire procedure can be regarded as the measurement of a single
pseudo-observable (which then results in the measurement of a single real observable). In this
case, the measures $\mu_{e_1}, \mu_{e_2}, \mu_{e_3}$ in Lemma 3.8 are not independent.



However, quantum mechanics predicts the same results irrespective of whether the
pseudo-observables are measured simultaneously or sequentially. Appleby argues that in the case of
sequential measurements, requiring that the measures $\mu_{e_1}, \mu_{e_2}, \mu_{e_3}$ are dependent leads to
complications. He thinks of a measurement where one selects three directions $e_1, e_2, e_3$ that
are very close to being orthogonal but with the property that one is still able to measure that
they are not. That is, he tries to trick nature by acting as if the precision of measurement 
is not as good as it actually is. Then, since the observables that are actually measured must 
commute, either it must be physically impossible to set up this experiment, or, when the 
measurement is performed, there is some force that actually changes the alignment of the 
set up (in a way noticeable for the experimenter). Either way, the model provides a definite 
empirical prediction (which is distinguishable from predictions made by quantum mechanics) . 

As they stand, the MKC-models do not provide an obvious way to escape these difficulties. 
It therefore seems that they are indeed not robust with regard to the finite precision of mea- 
surements, in a very drastic way. But, as noted earlier, the only thing proven is that a certain 
modification of the MKC-models is in conflict with predictions made by quantum mechanics. 
The question is left open of whether every modification should possess this property. I will
show in the next section that this is not the case, by constructing an explicit counterexample. 

3.6 A Modification of the MKC-Models for Imprecise 
Measurements 

To examine the behavior of the MKC-models for sequential measurements, it is good to look
at the way quantum mechanics deals with this. Considering the situation of Lemma 3.8,
after the first measurement of $\sigma_{e_1}^2$, the von Neumann postulate implies that the state of the
system changes instantaneously in such a (discontinuous) way that the measurement results
will satisfy $M_1 + M_2 + M_3 = 2$ with high probability.$^{63}$ The new state also makes the system



63 Other mechanisms that preserve the empirical relations like $M_1 + M_2 + M_3 = 2$ have been suggested to
prevent the use of the von Neumann postulate. However, that is not important for this discussion.
robust with regard to the finite precision of measurements. It may be possible to introduce a
similar postulate in the MKC-models. Barrett and Kent state that this is indeed the case:

"If the projectors are measured sequentially, then the rules of the model stip- 
ulate that the hidden state changes discontinuously after the measurement and 
Appleby's analysis no longer applies." [BK, p. 163]

This discontinuous change is supposed to take place in such a way 

"that the probability distribution of the post-measurement hidden variables corre- 
spond to that defined by the post-measurement quantum mechanical state vectors." 
[BK, p. 160]

However, it is not clear how this discontinuous change should affect the pure state (of the
hidden variable model) of a single system and if this can be done in such a way that Lemma
3.8 can no longer be applied. Indeed, the following comment by Appleby still seems to apply:



"The models discussed by MKC are incomplete, since they do not include a spec- 
ification of the dynamics. It is a highly non-trivial question, as to whether there 
exists a dynamics which, in every situation, gives rise to the probability distri- 
bution having the desired properties-not only in situations where the alignment 
errors arise "naturally", but also in situations where the errors are adjusted "by 
hand" (in the manner described in the last paragraph) [i.e., Appleby's procedure 
to trick nature]" |Appl, p. 17] 



The difficulty that arises when trying to consider a possible modification is that an immediate
change of the pure state $\lambda$ is required to ensure robustness, whilst the pure states
themselves have no direct relation with quantum mechanics. In line with Appleby, one may
assume that for each (imprecise) measurement of the (pseudo-)observable $P$, there is a probability
measure $\mu_P$ on $\mathcal{P}_{CK}(\mathcal{H})$ that selects the actual observable of which the value is revealed
upon measurement. To investigate what properties the immediate state change should obey,
the following definition is introduced.

Definition 3.2. For $\epsilon > 0$, an imprecise measurement of the pseudo-observable $P$ is called
$\epsilon$-precise if

$\mu_P\{P' \in \mathcal{P}_{CK}(\mathcal{H}) \,;\, \|P' - P\| > \epsilon\} < \epsilon.$ (165)

Now consider the sequential measurement of two pseudo-observables $P_1$, $P_2$ with $\|P_2 P_1\| < \delta$
for some $\delta > 0$. Suppose the first measurement of $P_1$ yielded the result 1 and let $\lambda'$ denote
the state of the system after the measurement. If all measurements are $\epsilon$-precise, $\lambda'$ is required
to satisfy the following relations:

$\mu_{P_1}\{P' \in \mathcal{P}_{CK}(\mathcal{H}) \,;\, \lambda'(P') = 1\} \to 1$, as $\epsilon, \delta \to 0$;
$\mu_{P_2}\{P' \in \mathcal{P}_{CK}(\mathcal{H}) \,;\, \lambda'(P') = 1\} \to 0$, as $\epsilon, \delta \to 0$. (166)

Similarly, if the measurement of $P_1$ yielded the result 0:

$\mu_{P_1}\{P' \in \mathcal{P}_{CK}(\mathcal{H}) \,;\, \lambda'(P') = 1\} \to 0$, as $\epsilon, \delta \to 0$. (167)



The most natural way to establish this would be to require that after the measurement of $P_1$
yielded the result 1, the state changes to a state $\lambda'$ that assigns the value 1 to all $P \in \mathcal{P}_{CK}(\mathcal{H})$
in some neighborhood of $P_1$ and the value 0 to all $P \in \mathcal{P}_{CK}(\mathcal{H})$ that are in the neighborhood
of some $P'$ perpendicular to $P_1$. Similarly if the measurement yielded the result 0.

This seems an impossible task, since it was shown by Appleby that every $\lambda$ is densely
discontinuous in a certain region $D_1$. However, Appleby made no specific claim about the
regions where $\lambda$ can be continuous. Indeed, Appleby did not (extensively) answer his own
question about how much of $S^2$ can be colored in such a way that the colors are empirically
knowable. It turns out that the MKC-models are sufficiently flexible to allow colorings that
are empirically knowable.

Proposition 3.9. For every finite-dimensional Hilbert space $\mathcal{H}$, for every unit vector $e \in \mathcal{H}$
with $P_e \in \mathcal{P}_{CK}(\mathcal{H})$ there are open subsets $U$, $U'$ of $\mathcal{P}(\mathcal{H})$ (with respect to the norm on $B(\mathcal{H})$)
and a non-empty subset $\Lambda_e \subset \Lambda$ such that

(i) $P_e \in U$ and $P_{e'} \in U'$ for all $e' \perp e$.

(ii) For all $\lambda \in \Lambda_e$, $\lambda(P) = 1$ for all $P \in U \cap \mathcal{P}_{CK}(\mathcal{H})$ and $\lambda(P) = 0$ for all $P \in U' \cap \mathcal{P}_{CK}(\mathcal{H})$.

Proof:

Let $e$ be given and define $U$ and $U'$ as follows:

$U := \{P \in \mathcal{P}(\mathcal{H}) \,;\, \|P P_e\| > \tfrac{1}{2}\}$,
$U' := \{P \in \mathcal{P}(\mathcal{H}) \,;\, \|P P_e\| < \tfrac{1}{n}\}$, (168)

where $n$ is the dimension of $\mathcal{H}$. It is easy to see that these sets are open.^{64}

Recall that each $\lambda$ is completely defined by its action on the one-dimensional projections
(by Lemma 3.4). Now, define

$\Lambda_e := \{\lambda_f \in \Lambda \,;\, f(m) = 1$ whenever $P^{(m)} \in U$, $f(m) = 0$ whenever $P^{(m)} \in U'\}$. (169)

It follows directly from this definition that $\Lambda_e$ satisfies criterion (ii). To show that it is not
empty, the following two statements have to be proven for every orthonormal basis $e_1, \ldots, e_n$
of $\mathcal{H}$:

1) If $P_{e_1} \in U$, then $P_{e_j} \notin U$ for all $j > 1$ (i.e., at most one of the $P_{e_i}$ lies in $U$).

2) If $P_{e_1}, \ldots, P_{e_{n-1}} \in U'$, then $P_{e_n} \notin U'$.

To prove 1) and 2), it is useful to note that for every projection $P$ one has

$\|P P_e\| = \sup\{\|P P_e \psi\| \,;\, \psi \in \mathcal{H}, \|\psi\| = 1\} = \sup_{\|\psi\|=1} \sqrt{\langle P P_e \psi, P P_e \psi \rangle}$
$= \sup_{\|\psi\|=1} \sqrt{\langle P_e \psi, P P_e \psi \rangle} = \sup_{\|\psi\|=1} \sqrt{\langle \psi, e \rangle \langle e, P e \rangle \langle e, \psi \rangle} = \|P e\|.$ (170)

Consequently, whenever $P$ and $P'$ are orthogonal projections, one has

$\|(P + P') P_e\| = \|P P_e\| + \|P' P_e\|.$ (171)

Now let $e_1, \ldots, e_n$ be any orthonormal basis. Suppose $P_{e_1} \in U$. Then for all $j > 1$

$1 \geq \|(P_{e_1} + P_{e_j}) P_e\| = \|P_{e_1} P_e\| + \|P_{e_j} P_e\| > \tfrac{1}{2} + \|P_{e_j} P_e\|,$ (172)

hence $\|P_{e_j} P_e\| < \tfrac{1}{2}$. Thus $P_{e_j} \notin U$.

To prove the second statement, suppose $P_{e_i} \in U'$ for all $i = 1, \ldots, n-1$. Using (171), it
follows that

$\|P_{e_n} P_e\| = \|1 P_e\| - \|\sum_{i=1}^{n-1} P_{e_i} P_e\| = 1 - \sum_{i=1}^{n-1} \|P_{e_i} P_e\| > 1 - (n-1)\tfrac{1}{n} = \tfrac{1}{n}.$ (173)

This proves the proposition. □

64 Indeed, if $P \in U$ choose $0 < \epsilon < \|P P_e\| - \tfrac{1}{2}$. Then $P' \in U$ whenever $\|P' - P\| < \epsilon$ (for all $P' \in \mathcal{P}(\mathcal{H})$):

$\|P P_e\| \leq \|P - P'\| \|P_e\| + \|P' P_e\| < \epsilon + \|P' P_e\|,$

thus $\|P' P_e\| > \|P P_e\| - \epsilon > \tfrac{1}{2}$. A similar argument shows that $U'$ is open.

Using this proposition, it may be argued that after the measurement of any (pseudo-)observable
$P$, the following procedure takes place: let $\lambda$ denote the state of the system before
the measurement and let $P'$ be the observable whose value was actually displayed by the
measurement. If $\lambda(P') = 1$, the state changes to some $\lambda' \in \Lambda_e$ for some unit vector $e$ with
$P_e \leq P'$. If $\lambda(P') = 0$, the state changes to some $\lambda' \in \Lambda_e$ for some unit vector $e$ with $P_e \leq P'^{\perp}$.
This procedure ensures that the MKC-models are robust with regard to sequential imprecise
measurements of almost commuting pseudo-observables. As a consequence, Appleby's Lemma
3.8 no longer applies to this modification.

The above-described procedure is a very drastic one to ensure robustness. For example,
the sets $U$ and $U'$ of Proposition 3.9 could have been taken a lot smaller. But there is also
a more natural way to select a new $\lambda'$. Consider a system in the macro state $\mathbb{P}_\rho$. If the
measurement of the (pseudo-)observable $P$ on an individual system yielded the result 1, a
new state $\lambda'$ may be selected at random in accordance with the measure $\mathbb{P}_{P \rho P}$. Similarly,
if the result was 0, a new state may be selected in accordance with the measure $\mathbb{P}_{P^\perp \rho P^\perp}$.

This scheme enables one to define the entire dynamics for the models. Indeed, one may
consider that the actual state of an individual system $\lambda(t)$ changes rapidly and stochastically
in time, precisely in such a way that for every $t$ one has

$\mathbb{P}[\lambda(t) \in \Delta] = \mathbb{P}_{\rho(t)}[\Delta], \quad \forall \Delta \subset \Lambda,$ (174)

where $\rho(t)$ is given by the Schrödinger postulate (when no measurement is performed) and the
von Neumann postulate (when a measurement is performed). More formally, the structure of
the non-contextual hidden variable theory may be defined by the following postulates:

1. State Postulate: For every physical system there is a set of pure states $\Lambda$, defined
as the set of valuation functions on $\mathcal{P}_{CK}(\mathcal{H})$ (which is well-defined by Theorem 3.3
and Lemma 3.4), where $\mathcal{H}$ denotes the finite-dimensional Hilbert space associated with
the system in quantum mechanics. The evolution of the state is given by a function
$\omega \in \Omega = \Lambda^{\mathbb{R}}$. At each time $t$ the pure state $\omega(t) \in \Lambda$ gives a complete description of the
system.

2. Observable Postulate: With each physical observable $\mathcal{A}$, there is associated a function
$\hat{A} : \Lambda \to \mathbb{R}$ that assigns to each state the value possessed by the observable $\mathcal{A}$ in that
state. A measurement of the observable $\mathcal{A}$ at the time $t$ yields the value $\hat{A}(\omega(t))$.

3. Dynamics Postulate: The evolution of the state of a single system is described by a
stochastic process $(X_t)_{t \in \mathbb{R}}$ with filtration $(\mathcal{F}_t)_{t \in \mathbb{R}}$ on the space $(\Omega, \Sigma, \mathbb{P})$ where

$X_t(\omega) := \omega(t),$ (175)

and $\mathcal{F}_t$ is the smallest $\sigma$-algebra generated by $(X_{t'})_{t' \leq t}$ and $\Sigma$ is the smallest $\sigma$-algebra
containing $\bigcup_{t \in \mathbb{R}} \mathcal{F}_t$. At each time $t$, the probability that $\lambda \in \Delta$ for any subset $\Delta \subset \Lambda$ is
given by $\mathbb{P}[X_t^{-1}(\Delta)]$.

These three postulates indeed imply that the theory at hand is a stochastic non-contextual 
theory, which in principle admits a realist interpretation. The following postulates only serve 
to ensure empirical equivalence with quantum mechanics. 

4. Extended Observable Postulate: With each physical observable $\mathcal{A}$, there is also associated
a self-adjoint operator $A$ for which the spectral decomposition $A = \sum_{a \in \sigma(A)} a P_a$
satisfies $P_a \in \mathcal{P}_{CK}(\mathcal{H})$ $\forall a \in \sigma(A)$. The relation between $\hat{A}$ and $A$ is given by

$\hat{A}(\lambda) = \sum_{a \in \sigma(A)} a \lambda(P_a).$ (176)

5. Extended Dynamics Postulate: The probability measure $\mathbb{P}$ on the space $(\Omega, \Sigma)$
satisfies

$\mathbb{P}[X_t^{-1}(\Delta)] = \mathbb{P}_{\rho(t)}[\Delta], \quad \forall \Delta \subset \Lambda,$ (177)

where the right-hand side is defined by Theorem 3.5 and $\rho(t)$ is the quantum-mechanical
state of the system whose time-evolution is given by the Schrödinger and von Neumann
postulates.

This last axiom ensures that the probability of finding a value in $\Delta$ upon measurement of the
observable $\mathcal{A}$ at the time $t$ is given by

$\mathbb{P}[X_t^{-1}(\hat{A}^{-1}(\Delta))] = \mathbb{P}[X_t^{-1}(\bigcup_{a \in \Delta} \{\lambda \,;\, \lambda(P_a) = 1\})]$
$= \mathbb{P}_{\rho(t)}[\bigcup_{a \in \Delta} \{\lambda \,;\, \lambda(P_a) = 1\}]$
$= \sum_{a \in \Delta} \mathbb{P}_{\rho(t)}[\{\lambda \,;\, \lambda(P_a) = 1\}]$
$= \sum_{a \in \Delta} \mathrm{Tr}(\rho(t) P_a) = \mathrm{Tr}(\rho(t) \mu_A(\Delta)),$ (178)

where equation (144) is used to arrive at the last line. As noted earlier, the continuity property
of the probability measures ensures that an imprecise measurement of an observable $\mathcal{A}$ yields,
with approximately the same probability, approximately the same result (see (146)). This
is the sincerest form of robustness one may expect for a stochastic theory and in this sense,
Appleby's objection (iii) (Section 3.5.2) no longer applies.



As a final remark it may be noted that the theory defined by the above five postulates
is in fact fundamentally indeterministic. Indeed, the actual probability measure $\mathbb{P}$ is only
determined up to the time $t$ since $\rho(t)$ is not determined for all $t$. That is, unless one argues
that it is also actually determined which experiments will be performed by experimenters, the
von Neumann postulate implies that $\rho$ actually evolves indeterministically.

3.7 Non-Locality of the MKC-Models

An interesting light may be shed on the MKC-models when studying them in the context
of the EPRB-experiment (Example 2.2) and the Bell inequality for this experiment (Section
2.3.4). The original MKC-model is extremely non-local for this system: it turns out that
almost every measurement on the first particle also interacts with the second particle and
vice versa. This can be seen as follows. The Hilbert space associated with the pair of spin-1/2
particles is $\mathbb{C}^4$. In quantum mechanics, an observable for the first particle corresponds with
an operator of the form $A \otimes 1$, where $A$ is some self-adjoint operator acting on the space $\mathbb{C}^2$.
But even if $A$ corresponds to an observable in the MKC-model, $A \otimes 1$ will most likely be a
pseudo-observable. For suppose $A \otimes 1$ and $1 \otimes A'$ are two observables with $A = a_1 P_1 + a_2 P_2$,
$A' = a'_1 P'_1 + a'_2 P'_2$. Then

$P_1 \otimes P'_1, \quad P_1 \otimes P'_2, \quad P_2 \otimes P'_1, \quad P_2 \otimes P'_2$ (179)

forms a resolution of the identity. Since each projection operator is only allowed to appear in
one resolution of the identity, there can be no other observables of the form $B \otimes 1$ or $1 \otimes B'$
unless $[A, B] = 0$ or $[A', B'] = 0$. Consequently, a measurement of a (pseudo-)observable $A \otimes 1$
will result in the measurement of some observable $(A \otimes 1)'$ close to $A \otimes 1$, which is most likely
not of the form $A' \otimes 1$ (and probably not even of the form $A' \otimes 1'$).^{65} In general, it is not
easy to find an interpretation of an observable that is not factorizable (like $(A \otimes 1)'$), but it
is generally agreed that it is a non-local observable, that is, a measurement of the observable
requires an interaction with both particles.^{66}

Despite this extreme non-local property, it does not automatically follow that the MKC-model
should actually violate the Bell inequality. This is because the EPRB-experiment is
generally expected to involve two subsequent measurements (a procedure that is not yet well
understood in the original MKC-models). In fact, if one would use an approach similar to the
one used by Appleby in Section 3.5.3, it would follow that the MKC-model does satisfy the
Bell inequality. That is, unless it is assumed that the spins of the two particles are measured
simultaneously every time. In this case the Bell inequality is violated in a somewhat peculiar
way. Each measurement reveals the value of some observable $(\sigma_{r_1} \otimes \sigma_{r_2})'$ in the neighborhood
of $\sigma_{r_1} \otimes \sigma_{r_2}$. But whereas in quantum mechanics $\sigma_{r_1} \otimes \sigma_{r_2}$ can be viewed as the product
of two observables $\sigma_{r_1} \otimes 1$ and $1 \otimes \sigma_{r_2}$, it is in general not the case that $(\sigma_{r_1} \otimes \sigma_{r_2})'$ can
be seen as the product of two observables in the neighborhood of $\sigma_{r_1} \otimes 1$ and $1 \otimes \sigma_{r_2}$. This
factorizability is indeed a necessary condition to derive a Bell inequality (see for example
[Bub2]). Consequently, the estimates made in (86) and (87) no longer hold. In other words,
the measurement of $\sigma_{r_1} \otimes \sigma_{r_2}$ can no longer be viewed as measuring the two spins and then
taking their product in the MKC-model.

The modified MKC-models introduced in the previous section can violate the Bell inequality
by the discontinuous state change upon measurement. It is easy to see that this entails
a violation of OILOC (while CILOC is left intact). But although a new form of non-locality
appears, one may argue that the extreme form of non-locality can be dropped in the modified
models.



65 The exact form of these observables depends, of course, on how one exactly constructs $\mathcal{P}_{CK}(\mathbb{C}^4)$. It is an
open question if it can be constructed in such a way that it respects the structure of $\mathbb{C}^2 \otimes \mathbb{C}^2$.

66 A similar sort of reasoning can be found in [App3] to argue for a certain form of contextuality. But even
if the (in my opinion somewhat vague) argument presented there appeals to the reader, it may be disposed of
by the modified MKC-models as I will indicate later.
The question is actually how a system consisting of several subsystems should be described
in the context of the MKC-model. In a straightforward approach, for a system consisting of
two subsystems, the set of states would be given by the set of all valuation functions on
$\mathcal{P}_{CK}(\mathcal{H}_1 \otimes \mathcal{H}_2)$, where $\mathcal{H}_1$ and $\mathcal{H}_2$ are the two Hilbert spaces associated with the subsystems
in quantum mechanics. This approach leads to the extreme non-locality discussed above.
However, instead of first trying to define the states, it is better to first look for a definition
of appropriate observables. It seems reasonable to assume that any measurement on the
composite system first requires an interaction with one subsystem and then with the other. For
example, a measurement of $A \otimes B$ consists of a measurement of $A$ on system 1, a measurement
of $B$ on system 2, and finally of taking the product of the results. Since quantum mechanics
predicts that the result will be independent of the order in which $A$ and $B$ are measured,
one can unambiguously speak of the measurement of $A \otimes B$. It therefore seems reasonable
that each observable for the composite system should be a function of observables for the
individual systems. The set of observables obtained in this way will in general no longer be
colorable. Consider for example the following set of observables for the pair of two spin-1/2
particles, which is uncolorable:^{67}

$\sigma_x \otimes 1$    $1 \otimes \sigma_x$    $\sigma_x \otimes \sigma_x$
$1 \otimes \sigma_y$    $\sigma_y \otimes 1$    $\sigma_y \otimes \sigma_y$
$\sigma_x \otimes \sigma_y$    $\sigma_y \otimes \sigma_x$    $\sigma_z \otimes \sigma_z$

It is plausible that there exists an orthonormal basis $x, y, z$ such that all these operators
would actually correspond to observables in this theory. However, in the scheme illustrated
above the only way to measure $\sigma_y \otimes \sigma_y$ is by a measurement of $\sigma_y \otimes 1$ and a measurement
of $1 \otimes \sigma_y$. A measurement of the pair $\sigma_y \otimes \sigma_y$, $\sigma_x \otimes \sigma_x$ would actually involve measurements
of all four of the 'local' observables that appear in the square. A disturbance due to the von
Neumann postulate would therefore render this example meaningless. Indeed, for the pure
states the functional relations (i.e., the FUNC rule) are only required to be satisfied locally,
whereas the von Neumann postulate ensures that the relations are also satisfied for non-local
measurements.
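
The uncolorability of the square above can be checked mechanically. The following is a minimal sketch in plain Python, not taken from the source: the helper names (`kron`, `matmul`, `sign_of_identity`, `is_colorable`) are illustrative, and a "coloring" is assumed here to mean an assignment of the eigenvalues +1 or -1 to the nine operators that respects the product relations within each row and each column of commuting operators.

```python
from itertools import product

# Pauli matrices and the 2x2 identity as nested lists.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Tensor (Kronecker) product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sign_of_identity(m):
    """Return +1 or -1 if m equals plus or minus the identity, else None."""
    n = len(m)
    for s in (1, -1):
        if all(abs(m[i][j] - (s if i == j else 0)) < 1e-12
               for i in range(n) for j in range(n)):
            return s
    return None

# The nine observables, laid out as in the square in the text.
square = [
    [kron(SX, I2), kron(I2, SX), kron(SX, SX)],
    [kron(I2, SY), kron(SY, I2), kron(SY, SY)],
    [kron(SX, SY), kron(SY, SX), kron(SZ, SZ)],
]

def triple_product(ops):
    return matmul(matmul(ops[0], ops[1]), ops[2])

row_signs = [sign_of_identity(triple_product(row)) for row in square]
col_signs = [sign_of_identity(triple_product([square[r][c] for r in range(3)]))
             for c in range(3)]

def is_colorable(row_signs, col_signs):
    """Search all 2**9 assignments of +1/-1 obeying the product relations."""
    for values in product((1, -1), repeat=9):
        v = [values[0:3], values[3:6], values[6:9]]
        rows_ok = all(v[r][0] * v[r][1] * v[r][2] == row_signs[r] for r in range(3))
        cols_ok = all(v[0][c] * v[1][c] * v[2][c] == col_signs[c] for c in range(3))
        if rows_ok and cols_ok:
            return True
    return False

print(row_signs, col_signs, is_colorable(row_signs, col_signs))
```

Every row multiplies to the identity, while the third column multiplies to minus the identity, so the product of all nine assigned values would have to be both +1 and -1. The exhaustive search accordingly finds no consistent assignment.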

The above discussion can be summarized as follows: for a composite system, consisting of
two subsystems, the set of pure states is given by $\Lambda_1 \times \Lambda_2$, where $\Lambda_i$ is the set of all valuation
functions on $\mathcal{P}_{CK}(\mathcal{H}_i)$. The evolution of the state of the composite system ($\omega : \mathbb{R} \to \Lambda_1 \times \Lambda_2$)
may now be viewed as described by two coupled stochastic processes guided by the density
operators $\rho_1$ and $\rho_2$ that are derived from the density operator $\rho$ on the space $\mathcal{H}_1 \otimes \mathcal{H}_2$ and
the use of the partial trace, i.e. $\rho_1 = \mathrm{Tr}_2(\rho)$ where $\mathrm{Tr}_2 : B(\mathcal{H}_1 \otimes \mathcal{H}_2) \to B(\mathcal{H}_1)$ is the unique
linear operator that satisfies

$\mathrm{Tr}_2(A_1 \otimes A_2) = \mathrm{Tr}(A_2) A_1, \quad \forall A_1 \in B(\mathcal{H}_1), A_2 \in B(\mathcal{H}_2),$ (180)

and similarly $\rho_2 = \mathrm{Tr}_1(\rho)$. The procedure to cope with these partial traces becomes clearer
in an example.
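
As a purely numerical aside, the defining property (180) can be illustrated with a short sketch in plain Python. The function names `partial_trace_1` and `partial_trace_2`, the test matrices, and the choice of the singlet state of two spin-1/2 particles are assumptions made for illustration only; they do not come from the source.

```python
from math import sqrt

def kron(a, b):
    """Tensor (Kronecker) product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def partial_trace_2(rho, d1, d2):
    """Trace out the second factor of an operator on C^d1 (x) C^d2."""
    return [[sum(rho[i * d2 + j][k * d2 + j] for j in range(d2))
             for k in range(d1)] for i in range(d1)]

def partial_trace_1(rho, d1, d2):
    """Trace out the first factor of an operator on C^d1 (x) C^d2."""
    return [[sum(rho[i * d2 + j][i * d2 + l] for i in range(d1))
             for l in range(d2)] for j in range(d2)]

# Check property (180) on one product operator (arbitrary test matrices).
A1 = [[1, 2], [3, 4]]
A2 = [[5, 6], [7, 8]]
lhs = partial_trace_2(kron(A1, A2), 2, 2)
rhs = [[trace(A2) * A1[i][k] for k in range(2)] for i in range(2)]

# Singlet state (|01> - |10>)/sqrt(2); psi is real, so no conjugation is needed.
psi = [0, 1 / sqrt(2), -1 / sqrt(2), 0]
rho = [[psi[i] * psi[j] for j in range(4)] for i in range(4)]
rho1 = partial_trace_2(rho, 2, 2)
rho2 = partial_trace_1(rho, 2, 2)
print(lhs == rhs, rho1, rho2)
```

For the singlet state both reduced operators come out as half the identity (up to floating-point rounding), so each subsystem on its own is guided by the maximally mixed state.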

67 This example is taken from [Mer3].
Let $\mathcal{A}$ be an observable for the first subsystem and $\mathcal{B}$ an observable for the second system,
and suppose one wants to measure their product $\mathcal{AB}$. The self-adjoint operator associated
with this observable is denoted by $A \otimes B$, and the corresponding stochastic variable is given
by

$\widehat{AB} : \Lambda_1 \times \Lambda_2 \to \mathbb{R}, \quad \widehat{AB}(\lambda_1, \lambda_2) := \hat{A}(\lambda_1) \hat{B}(\lambda_2).$ (181)

Suppose a measurement of $\mathcal{AB}$ first entails an interaction with subsystem 2 and then with
subsystem 1. Let $t$ be the time at which the measurement was initiated and let $t'$ be the time
at which the interaction with subsystem 2 has taken place ($t$ and $t'$ may be assumed arbitrarily
close to each other). The probability that the measurement yields the result $x$ may then be
calculated as follows:



F[ABG{x}]= Y, P pi(*') 

b£a(B) 



A € {x/b}\B G {b} 



P 



P2(t) 



BG{b} 



, 5: l ^ {x/h) — Trd^wi^n) — Tr( ^ 1 0n) 
^ i ff(A) (x/6)*&(^)p x/6 ®n) 



(182) 



bea(B) 



E 



Tr (p(t)P a ® P b ) , 



(a,6)eo-( J 4)x CT (_B) 
afe=c 

which is exactly the probability predicted by quantum mechanics. It follows that this proba- 
bility is independent of whether the measurement apparatus first interacts with subsystem 1 
or with subsystem 2. The requirement that this interaction between the two subsystems (i.e. 
the conditionalizing of the probabilities) must take place no matter how close t' is to t, in fact 
leads to a violation of OILOC. This violation thus comes as a blessing for this theory since it 
enables one to avoid the extreme non-locality discussed at the beginning of this section. 



3.8 Conclusion 

In my opinion, the modified version of the MKC-models is a non-contextual hidden-variable
theory. If it is taken to be the sole statement of the Kochen-Specker Theorem that theories
that assign definite values to all observables in a non-contextual way (consistent with the
algebraic relations holding in quantum mechanics) are impossible to construct, one might say
that the theorem has indeed been 'nullified'. However, the term 'nullification' seems to imply
that the entire theorem may be rendered useless. With this statement I disagree, and instead
I tend to agree with Appleby that

"The PMKC models do not nullify the Bell-KS theorem. Instead, they give us a
deeper and more accurate insight into what the theorem is telling us." [App4, p. 22]

Appleby states that what the Kochen-Specker Theorem is telling us is actually a deeper
statement than the one that there are no non-contextual hidden-variable theories. He adopts
Bell's view on the theorem that the main conclusion to be drawn is that

". . .the result of the measurement does not actually tell us about some property
previously possessed by the system. . . " [Bel3]
But, as argued before, I think this statement is too strong and does not apply to the MKC-models.
Instead, the main statement may be taken to be that point 3) of Section 3.5.2 must be
violated by any hidden-variable theory, and it is a matter of taste if one is willing to allow
this. It may also be emphasized that the full scope of the Kochen-Specker Theorem is restored
as soon as there appears to be a method to perform infinitely precise measurements (if only
for a finite uncolorable set of observables). Although unlikely and unimaginable, this may be
a possibility.

But even if the Kochen-Specker Theorem happens to have been nullified, there are still 
theorems stating that every hidden-variable theory must be non-local. Indeed, the MKC- 
models turn out to be no exception (I will also come back to this point in the next Chapter). 
For me, this is enough to rule out the possibility of acceptable hidden-variable theories. 

For a closing comment, I concur with the one made by Barrett and Kent:

"We would like to emphasize that neither the preceding discussion nor earlier contributions
to this debate [. . . ] are or were intended to cast doubt on the essential
importance and interest of the Kochen-Specker Theorem. As we have stressed
throughout, our interest in examining the logical possibility of non-contextual hidden
variables simulating quantum mechanics is simply that it is a logical - if scientifically
highly implausible - possibility, which demonstrates interesting limitations
on what we can rigorously infer about fundamental physics." [BK, p. 174]

4 The Free Will Theorem Stripped Down

There is no effective scientific test for free will. You can't 
run the universe again, with everything exactly as it was, 
and see if a different choice can be made the second time 
round. 

- T. Pratchett, I. Stewart & J. Cohen 

4.1 Introduction 

In [CK2] and [CK4] Conway and Kochen present a very remarkable theorem. In their own
words

"[i]t asserts, roughly, that if indeed we humans have free will, then elementary
particles already have their own small share of this valuable commodity." [CK4]

This is a very strong and strange statement, and it seems unlikely that such a strong philosophical
statement can be proven by means of mathematics alone (along with some physical
axioms). While discussing the theorem, I'll try to lay bare where the philosophical assumptions
turn into mathematical ones and where the mathematical conclusions are transformed
back to philosophical conclusions. The stronger the claim of a theorem, the more important
such transformations become. But of course, it is easy to criticize such transformations. The
simplest argument will be that philosophical notions simply have a different nature from mathematical
ones and therefore, any translation from the one into the other cannot be flawless
(see also the discussion at the end of Section 2.3.4). But I will also try to give some critique
while accepting these transformations, weakening the so-called "free will theorem" even for the
believers.

The theorem that will be discussed here is in fact "The Strong Free Will Theorem"
[CK4], which seems a bit more transparent than the original Free Will Theorem [CK2]. I will
try to explain the theorem in such a way that it becomes clear that it doesn't so much depend
on either quantum mechanics or relativity. Ideally, the theorem can be understood by anyone
with a decent knowledge of mathematics and philosophy, with the hope that people from
these disciplines can also enter the discussion. This should certainly be the aim of any theorem
that is supposed to make a statement about free will.



4.2 The Axioms 

Three axioms are introduced, named SPIN, TWIN and MIN. I will state them here verbatim
as they appear in [CK4].



SPIN: Measurements of the squared (components of) spin of a spin-1 particle in 
three orthogonal directions always give the answers 1,0,1 in some order. 



This axiom is of course derived from quantum theory. But this theory is not necessarily
needed to make it an acceptable axiom. It is in fact an experimentally testable prediction.
The term 'squared spin of a spin-1 particle along some axis' may be directly associated with
some procedure to measure this observable. Such a procedure (or an appropriate
equivalence class thereof) may even be taken as the definition of the observable.^{68} That is,
once the axiom has been experimentally verified,^{69} it becomes an axiom that must be derivable
from any theory that describes the associated experiment.

From this point of view, the axiom can be explained in the following way: for each choice of
three orthogonal directions $e_1, e_2, e_3$, one can construct a measuring context that always gives
three numbers $x_1, x_2, x_3$ with the special property that two of these numbers are equal to 1,
and the other is equal to 0. It is also possible to unambiguously associate each of the obtained
numbers $x_i$ with exactly one of the chosen directions $e_i$. The exact procedure involved is
irrelevant for the argument made by the Free Will Theorem.

Explained in this way, the SPIN axiom is a very modest one. There are, of course, many
other procedures to obtain three numbers $x_1, x_2, x_3$ with the special property that two of these
numbers equal 1 and the other equals 0. One may, for example, put three balls in a bag, one
of them blue and the other two red, then take out the balls one at a time and take a number
for each ball: 0 if the ball is blue and 1 if it is red.^{70} The reason for taking spin-1 particles
only becomes clear in the second axiom.
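
The ball-drawing procedure just described can be written down as a small simulation. This is only an illustrative sketch: the function name `draw_balls` and the use of a seeded pseudo-random generator are assumptions, not part of the argument.

```python
import random

def draw_balls(rng):
    """Draw three balls (one blue, two red) one at a time without replacement.

    Returns a tuple of three numbers: 0 for the blue ball, 1 for a red one.
    """
    bag = ["blue", "red", "red"]
    rng.shuffle(bag)
    return tuple(0 if ball == "blue" else 1 for ball in bag)

rng = random.Random(0)  # seeded only to make the sketch reproducible
outcomes = [draw_balls(rng) for _ in range(1000)]
print(set(outcomes))
```

Every run yields two ones and a single zero in some order, which is all that the SPIN axiom, read in this modest way, demands of a single triple experiment.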

TWIN: For twinned spin-1 particles, suppose experimenter A performs a triple 
experiment of measuring the squared spin component of particle a in three or- 
thogonal directions x, y, z, while experimenter B measures the twinned particle 
b in one direction, w. Then if w happens to be in the same direction as one of 
x,y,z, experimenter B's measurement will necessarily yield the same answer as 
the corresponding measurement by A. 

The above formulation of this axiom by Conway and Kochen is a bit unfortunate. In fact, 
it is not an axiom at all, but rather a definition of the term "twinned spin-1 particles". The 
axiom should then be taken to be the statement that twinned spin-1 particles exist. Or, more 
precisely, that it is possible to set up two systems, conveniently named particles a and b, such 
that a measurement on system b necessarily yields the same value as one of the obtained values 
from a measurement on system a if certain criteria are met. My point becomes clearer in a 
story. 

The TWIN axiom states that two experimenters A (Alice) and B (Bob) can come together
to set up part of an experiment. Then they split the experiment into two and each of them
takes one part to his/her own home laboratory. Once at home, each of the experimenters can
choose to perform a measurement on their part of the experiment at any time they like. In
advance, Alice and Bob have agreed on two sets of possible measurements from which they
are allowed to pick one. The experiments Bob is allowed to perform are parameterized by a
three-dimensional direction w. Suppose that he is allowed to choose one of the 33 directions
from the Peres proof of the Kochen-Specker Theorem (see table 2). Each of the possible
experiments has 0 and 1 as the only possible outcomes, i.e., with each of the 33 directions
there is associated a physical quantity with possible values 0 and 1. The experiments can
be thought of as the ones appearing in the SPIN axiom, with the extra procedure that two
of the three results are thrown away. Alice, on the other hand, is allowed to choose from 40
experiments, each of them associated with one of the triads that appear in the Peres proof
(table 3) and corresponding to the experiment from the SPIN axiom.

68 The thesis that observables only have meaning with reference to the way they are measured is common
in Copenhagen interpretations of quantum mechanics. See for example [Hei, Ch. III].

69 This can of course only be done up to certain precision; I will come back to this point in Section
4.4.3. It may also be noted that the known elementary spin-1 particles (the W or Z boson) aren't the easiest
ones to manipulate for experimental testing. However, properties similar to the ones assumed in SPIN may be
accomplished by working with coupled spin-1/2 particles. For example, something similar was done in [HLLB].
That is, one doesn't necessarily need elementary spin-1 particles to test SPIN.

70 Of course, the statistical behavior of spin-1 particles is much richer, as predicted by quantum mechanics.
But it is not demanded here that quantum mechanics is true.

Thus far, this story is merely a consequence of the SPIN axiom. The new assumption is 
that when Alice and Bob are still together, they can arrange their experiment in such a way 
that if the direction Bob chooses at home coincides with one of the directions in the triad that 
Alice chooses, then their associated measurement results will also coincide no matter at what 
time either of the experimenters chooses to do their experiment. In short, they can prepare a 
pair of twinned spin-1 particles. Again, this is an experimentally testable axiom; the fact that
quantum mechanics predicts this behavior is remarkable, though of secondary importance.

The third axiom actually incorporates two different axioms. One of these declares the free 
will of the experimenters and the other states a certain notion of locality. 

MIN: Assume that the experiments performed by A and B are space-like separated.
Then experimenter B can freely choose any one of the 33 particular
directions w, and a's response is independent of this choice. Similarly and independently,
A can freely choose any one of the 40 triples x, y, z, and b's response
is independent of that choice.

The axiom firstly states that both Alice and Bob have a form of free will that gives them
the ability of free choice. In particular, they both have the ability to choose whether or not
they let their choice depend on the choice of the other. Their choices are free in the sense that 
their history (up to the point of the experiment) does not determine the settings. 

Secondly, the notion of locality used in the axiom, is that the choice of the experiment of 
any experimenter does not influence the outcome of the experiment performed by the other. 
Since the experimenters themselves choose the time at which they perform their experiment, 
it may be arranged that neither does Alice's experiment lie in the causal future of Bob, nor 
does Bob's experiment lie in the causal future of Alice. This is possible under the assumption 
that instantaneous influences are prohibited and that there is no absolute frame of reference. 

4.3 The Theorem 

Conway and Kochen stated their Strong Free Will Theorem as follows [CK4]:

Theorem 4.1. The axioms SPIN, TWIN and MIN imply that the response of a spin 1 particle 
to a triple experiment is free - that is to say, is not a function of properties of that part of the 
universe that is earlier than this response with respect to any given inertial frame. 

Obviously, the second part of the statement makes the first part even more mystifying. 
Being mathematicians, Conway and Kochen focus their proof of the theorem primarily on 
mathematical considerations. As a result, the philosophical considerations have become ob- 
scured. In an attempt to give both sides of the story the attention they deserve, I'll first 
present the main mathematical reasoning in a lemma. 

Lemma 4.2. It is not possible to simultaneously assign definite values to the outcomes of all 
possible experiments on twinned spin-1 particles, without violating the SPIN or TWIN axiom. 
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Proof: 

The assignment of 'definite values to the outcomes of all possible experiments' means that
with each experiment, one associates a unique number (or set of numbers) that denotes the
outcome of the experiment. Let $E_B$ denote the set of the 33 possible experiments for Bob and
let $E_A$ denote the set of the 40 possible experiments for Alice. A definite value assignment that
meets the SPIN axiom then consists of two parts, namely a function $\theta_B : E_B \to \{0, 1\}$ and a
function $\theta_A : E_A \to \{(0, 1, 1), (1, 0, 1), (1, 1, 0)\}$. At this point there are still
$2^{33} \cdot 3^{40} \approx 10^{29}$ possible assignments.
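
The order of magnitude quoted here can be checked directly. The following is a minimal sketch that only uses the two set sizes given in the text (33 experiments for Bob, 40 triads for Alice); the variable names are illustrative and not taken from the source.

```python
# Count the candidate value assignments before the TWIN axiom is imposed.
n_bob = 33    # number of experiments available to Bob (size of E_B)
n_alice = 40  # number of triads available to Alice (size of E_A)

bob_assignments = 2 ** n_bob      # each experiment of Bob gets the value 0 or 1
alice_assignments = 3 ** n_alice  # each triad gets (0,1,1), (1,0,1) or (1,1,0)
total = bob_assignments * alice_assignments

# Python integers are exact, so the count itself is exact;
# the formatted string shows the order of magnitude.
print(total)
print(f"{total:.2e}")
```

The count is far too large for an exhaustive search, which is why the proof proceeds by reducing the problem to the colouring result of Lemma 2.12 instead.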

For any triad $(x_1, x_2, x_3) \in E_A$, let $\theta_A^j(x_1, x_2, x_3)$ denote the $j$-th component of the triplet
$\theta_A(x_1, x_2, x_3)$ for $j = 1, 2, 3$. For any fixed $\theta_B$, the TWIN axiom implies that for every $x \in E_B$
the following condition should hold: if $x$ also appears as the $j$-th component in the triad
$(x_1, x_2, x_3) \in E_A$, then

$$\theta_B(x) = \theta_A^j(x_1, x_2, x_3). \qquad (183)$$

Note that $E_B$ and $E_A$ have been constructed in such a way that if $x$ appears both as the $j$-th
component in $(x_1, x_2, x_3) \in E_A$ and as the $k$-th component in $(x'_1, x'_2, x'_3) \in E_A$, then $x \in E_B$.
Consequently, if $x$ appears both as the $j$-th component in $(x_1, x_2, x_3) \in E_A$ and as the $k$-th
component in $(x'_1, x'_2, x'_3) \in E_A$, then

$$\theta_A^j(x_1, x_2, x_3) = \theta_A^k(x'_1, x'_2, x'_3). \qquad (184)$$

This allows one to construct a function $\theta : E_B \to \{0, 1\}$ in the following way:

$$\theta(x) := \theta_A^j(x_1, x_2, x_3) \text{ if } x \text{ is the } j\text{-th component in the triad } (x_1, x_2, x_3). \qquad (185)$$

Since each element $x \in E_B$ appears in at least one triad in $E_A$, this function is actually
defined for all $x \in E_B$, and because of (184) it is also well-defined, i.e., the value assignment
is independent of the choice of the triad in which $x$ appears. Furthermore, the function $\theta$
satisfies the special property that for each orthogonal triple $x, y, z \in E_B$ one has

$$\theta(x) + \theta(y) + \theta(z) = 2. \qquad (186)$$



But according to Lemma 2.12, such a function does not exist. Since each choice of $\theta_B$ together
with the TWIN axiom leads to the construction of an impossible function, the assignment of
a definite value to the outcomes of all possible experiments is not possible without violating
either the SPIN or the TWIN axiom (or both). □
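
The construction in (185) and the consistency requirement (184) can be illustrated with a small sketch. The direction labels and the two triads below are hypothetical toy data, not the actual 33 directions and 40 triads used in the proof, and the helper name `build_theta` is an assumption made for this illustration only.

```python
from typing import Dict, Optional, Tuple

Triad = Tuple[str, str, str]
Outcome = Tuple[int, int, int]

# Outcomes allowed by the SPIN axiom for a triple experiment.
SPIN_OUTCOMES = {(0, 1, 1), (1, 0, 1), (1, 1, 0)}


def build_theta(theta_A: Dict[Triad, Outcome]) -> Optional[Dict[str, int]]:
    """Build theta from theta_A as in (185); return None if (184) fails."""
    theta: Dict[str, int] = {}
    for triad, outcome in theta_A.items():
        if outcome not in SPIN_OUTCOMES:
            raise ValueError(f"{outcome} is not allowed by the SPIN axiom")
        for direction, value in zip(triad, outcome):
            # setdefault stores the first value seen and returns the stored one.
            if theta.setdefault(direction, value) != value:
                return None  # two triads disagree on a shared direction
    return theta


# Hypothetical toy data: two triads sharing the direction "z".
consistent = {("x", "y", "z"): (1, 1, 0), ("u", "v", "z"): (1, 1, 0)}
conflicting = {("x", "y", "z"): (1, 1, 0), ("u", "v", "z"): (0, 1, 1)}

theta_ok = build_theta(consistent)
theta_bad = build_theta(conflicting)

# Property (186): the values on every triad sum to 2.
sum_rule_holds = all(sum(theta_ok[d] for d in triad) == 2 for triad in consistent)

print(theta_ok)
print(theta_bad)
print(sum_rule_holds)
```

In this toy case a consistent assignment exists because the two triads share only one direction. The content of the lemma is that for the actual sets, where the triads overlap in many directions, no assignment can pass this kind of check.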



Proof of Theorem 4.1



The structure of the proof is as follows. First, it will be shown that the assumption of MIN,
SPIN and TWIN together with the assumption that the responses of spin-1 particles to a
triple experiment are not free implies the existence of an assignment of definite values to the
outcomes of all possible experiments. It then follows from Lemma 4.2 that the axioms SPIN
and TWIN cannot both hold. Since all three axioms SPIN, TWIN and MIN are assumed
to hold, a contradiction is established, leading to the conclusion that the response of a spin-1
particle to a triple experiment cannot be not free. It then follows that its response must indeed
be free.

From a logical point of view, Conway and Kochen avoid (possibly improper) use of the 
double negation elimination by giving free will a negative definition: 
Definition 4.1. The response of the particle is said to be free if it is not determined by what
has happened at earlier times (in any inertial frame). 



This is an adaptation of Conway and Kochen's definition of the free choice of the
experimenter:



"To say that A's choice of x, y, z is free means more precisely that it is not deter- 
mined by (i.e., it is not a function of) what has happened at earlier times (in any 
inertial frame)." [CK4]

Their articles do not mention how exactly their definition corresponds to the
definition of the free response of the particles, but I think Definition 4.1 is the most obvious
one.

Now first consider particle b. Since the response of the particle is assumed to be
deterministic, a definite value can be assigned to the outcome of the measurement that is actually being
performed by Bob.$^{71}$ Since Bob (by the MIN axiom) has a free choice which experiment he
wishes to perform (i.e. the actual measurement he will perform is not determined), the particle
must have a deterministic response for each possible choice of Bob. Thus at any time point $t_B$
(defined in the reference frame of Bob) there is a definite value assignment $\theta_B^{t_B} : E_B \to \{0, 1\}$,
where $\theta_B^{t_B}(x)$ denotes the pre-determined measurement result if Bob chooses to perform the
measurement $x$ at time $t_B$ (here, the SPIN axiom has been used). Using the same sort of
reasoning, at any time point $t_A$ (defined in the reference frame of Alice) there is a definite
value assignment $\theta_A^{t_A} : E_A \to \{(0, 1, 1), (1, 0, 1), (1, 1, 0)\}$, where $\theta_A^{t_A}(x_1, x_2, x_3)$ denotes the
pre-determined measurement result if Alice chooses to perform the measurement $(x_1, x_2, x_3)$
at time $t_A$.

In principle, the TWIN axiom only has to be satisfied for the actual pair of experiments
performed by Alice and Bob. If $t_B < t_A$ in all inertial frames,$^{72}$ $\theta_A^{t_A}$ can be arranged to satisfy
the TWIN axiom for the specific outcome $\theta_B^{t_B}(x)$ for the specific choice of $x$ made by Bob. Vice
versa, $\theta_B^{t_B}$ can be arranged to satisfy the TWIN axiom for the specific outcome $\theta_A^{t_A}(x_1, x_2, x_3)$
for the specific choice of $(x_1, x_2, x_3)$ made by Alice if $t_A < t_B$ in all inertial frames.

However, $t_A$ and $t_B$ may be chosen in such a way that neither $\theta_A^{t_A}$ is allowed to influence
$\theta_B^{t_B}$, nor $\theta_B^{t_B}$ is allowed to influence $\theta_A^{t_A}$, that is, they are independent in the sense of the MIN
axiom. Therefore, since all possible choices for experiments are allowed for both Alice and
Bob, the TWIN axiom must actually be satisfied for all these possible choices. But this is
impossible according to Lemma 4.2.



It then follows that if one maintains the axioms SPIN, TWIN and MIN, the responses of the
particles a and b to a triple experiment are not both determined. By De Morgan's law, it then
follows that at least one of the particles must have a 'free will' in the sense of Definition 4.1. □



71 Here, one may raise the question whether the (mathematical) notion of value definiteness is a necessary 
condition for the (philosophical) concept of determinism to hold. I think it is. 

72 This is a fancy way of saying that events happening at time $t_B$ are allowed to have a causal influence
on the events happening at time $t_A$.
4.4 Discussion

4.4.1 Free Will and Determinism 

Most readers will find the Free Will Theorem disturbing when they encounter it for the first
time. Indeed, most people have a certain notion of free will that is unlikely to be ascribable
to particles. This notion will most likely also differ from the one given by Conway and
Kochen (quoted just below Definition 4.1), because having free will is usually seen as a positive
statement; it is a property someone may possess. Conway and Kochen, however, define free
will as a negative statement (and not even as a property): it is defined as the absence of some
property that someone's actions may have. The reason why they use this definition is rather
strange. In their own words:

"Our provocative ascription of free will to elementary particles is deliberate, since 
our theorem asserts that if experimenters have a certain freedom, then particles
must have exactly the same kind of freedom. Indeed, it is natural to suppose that 
this latter freedom is the ultimate explanation of our own." [CK4]

I think it's questionable to state that this sort of freedom explains what free will is, because the 
assumption that they actually use in the proof of their theorem (under the guise of free will) 
is not that experimenters have free will, but that some of their choices are not pre-determined. 
Their definition also allows other forms of free will that are not explored in their paper. 

It seems natural that the freedom of choice possessed by the experimenter entails a certain 
form of indeterminism in the world, i.e. it is a common notion that some of our actual choices 
are not determined. However, the reasoning the other way around does seem to be a problem. 
Typically, our freedom of choice is accompanied by the feeling of being in control of the 
situation. Making a choice is an actual act, whereas indeterminism can be entirely passive. 
The notion of free will to which Conway and Kochen appeal in the MIN axiom is appealing, 
not because the choices made by Alice and Bob are not pre-determined, as they claim, but, in 
my opinion, because they can actually make these choices.$^{73}$ It seems unreasonable to me to
assume that the indeterministic behavior of the particles is similarly due to an active 'choice' 
made by these particles. 

But even within the context of the free will theorem, one may argue that the freedom 
of the particles cannot be "exactly the same kind of freedom" that experimenters have. It 
is clearly assumed by Conway and Kochen that Alice and Bob have an independent form of 
free will. However, the particles in the story certainly do not. Once one particle has decided 
what value it wishes to bring about at the performance of a measurement, the other particle is 
immediately robbed of its free will. In fact, applying the term 'free will' to the behavior of the
particles seems to introduce a new form of non-locality into the story, since the particles have
to agree on what choice they make, using communication skills that violate the law of causality.
Secondly, it is plausible that the freedom to choose which experiment to perform is not the
only freedom experimenters have. Besides being experimenters, most experimenters are also
known to be normal human beings who sometimes choose not to perform a measurement and
rather have a cup of coffee.$^{74}$ This is also a freedom particles do not have. If the experimenter



73 Indeed, if the choices made by the experimenters were only due to some stochastic process, there would 
be no motivation to assign definite values to the outcomes of all possible experiments, since in that case one 
may argue that the stochastic process that selects the experiment also selects the outcome of the experiment 
(and thus doesn't have to select outcomes for all possible experiments). 

74 Some may even consider throwing it across the room [Con1].
chooses to perform a measurement, the particle is forced to produce a measurement result. It
seems that experimenters have an active form of free will, whereas particles (if indeed they 
have some sort of free will) merely have a passive form of free will. 

Thus far I have only argued that the term 'free will' is an unfortunate one to describe the 
peculiar behavior of the particles in the Free Will Theorem. It seems more natural to describe 
this behavior as indeterministic, which is usually defined as the denial of determinism. One 
often finds the following description of determinism by Laplace: 

"We ought then to regard the present state of the universe as the effect of its ante- 
rior state and as the cause of the one which is to follow. Given for one instant an 
intelligence which could comprehend all the forces by which nature is animated and 
the respective situation of the beings who compose it-an intelligence sufficiently 
vast to submit these data to analysis-it would embrace in the same formula the 
movements of the greatest bodies of the universe and those of the lightest atom; 
for it, nothing would be uncertain and the future, as the past, would be present 
to its eyes." [Lap]

More loosely put, given a complete account of the current situation of the entire universe, 
there is only one possible future. 

But one does not need the Free Will Theorem to recognize that this form of determinism 
is excluded if experimenters have free will. One only needs to think of the following insipid 
example. If I hold a ball, I can choose to do with it whatever I want, making the path of the 
ball indeterministic. In more general terms, if there are indeterministic events present in a 
theory that have a causal influence on other events (which seems a necessary consequence of 
the form of free will of experimenters introduced by Conway and Kochen), these other events 
will necessarily also have indeterministic aspects. 

In Newtonian mechanics, the situation may be considered to be even worse. Here, one does
not even have to assume the free will of experimenters to encounter a form of indeterminism.$^{75}$
Consider a particle with mass 1 in one dimension. Supposing it is subjected to the force
$F(x, t) = 6\,\mathrm{sgn}(x)\,|x|^{1/3}$ with initial position $x(0) = 0$ and initial momentum $p(0) = 0$, the
equation of motion

$$\frac{d^2 x}{dt^2} = 6\,\mathrm{sgn}(x)\,|x|^{1/3} \qquad (187)$$

does not have a unique solution,$^{76}$ i.e. the motion of the particle is not determined. The
example is even robust against a more general definition of determinism that allows that an
entire account of the present state alone may not be enough to determine the future:

"Determinism is the thesis that the past and the laws of nature together determine, 
at every moment a unique future." [vE] 

For suppose it is known that $x(t) = 0$ and $p(t) = 0$ for all $t < 0$, then still $x(t) = t^3$, $x(t) = -t^3$
and $x(t) = 0$ for $t > 0$ are possible solutions.
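
That all three trajectories solve the same equation of motion can be verified numerically. The sketch below assumes the force law $F(x) = 6\,\mathrm{sgn}(x)|x|^{1/3}$ for a unit mass and uses a central finite difference for the second derivative; the function names, step size and sample times are choices made for this illustration only.

```python
import math


def force(x: float) -> float:
    """Force F(x) = 6 sgn(x) |x|^(1/3) acting on a particle of unit mass."""
    if x == 0:
        return 0.0
    return 6.0 * math.copysign(abs(x) ** (1.0 / 3.0), x)


def acceleration(traj, t: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of the second derivative of traj at t."""
    return (traj(t + h) - 2.0 * traj(t) + traj(t - h)) / (h * h)


# Three candidate trajectories, all at rest at the origin for t <= 0.
candidates = {
    "x(t) = t^3": lambda t: t ** 3 if t > 0 else 0.0,
    "x(t) = -t^3": lambda t: -(t ** 3) if t > 0 else 0.0,
    "x(t) = 0": lambda t: 0.0,
}


def max_residual(traj, times) -> float:
    """Largest deviation |x''(t) - F(x(t))| over the sample times."""
    return max(abs(acceleration(traj, t) - force(traj(t))) for t in times)


# Sample times on both sides of t = 0; t = 0 itself is skipped because the
# finite-difference stencil would straddle the kink in the trajectory there.
times = [0.1 * k for k in range(-20, 21) if k != 0]
residuals = {name: max_residual(traj, times) for name, traj in candidates.items()}

for name, r in residuals.items():
    print(f"{name}: max |x'' - F(x)| = {r:.1e}")
```

All three residuals stay at the level of floating-point rounding, so the same past (the particle at rest at the origin) is compatible with three different futures. A numerical integrator started at $x = 0$, $p = 0$ would only ever produce the trivial solution, which is one reason the check is done on the candidate trajectories rather than by integrating forward.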

The definition of determinism given by van Inwagen has my preference over the definition 
by Laplace. It has always puzzled me why determinism is often presented with the extra 

75 The following example is taken from [Lan3], which is in turn inspired by [Ear].

76 Possible solutions are for example $x(t) = t^3$ or $x(t) = -t^3$. It should be noted that this indeterminacy
can only occur when the force does not satisfy the Lipschitz condition. Sometimes this condition is added as 
an axiom for Newtonian mechanics to ensure that the equation of motion always has a unique solution. 
condition that the present renders the past as irrelevant for determining future events; which
may perhaps conveniently be named the "deterministic Markov property".$^{77}$

The above examples point out that value definiteness for all observables (a property of 
classical physics) does not necessarily imply determinism. Indeed, determinism does not only 
require that certain observables have a definite value, but also that the way these values change 
in time is uniquely given. If the Free Will Theorem is not to be rendered trivial, it must be 
shown that there is a crucial difference between indeterminism with value definiteness and 
indeterminism without value definiteness (and not only from a realist point of view). 

I don't believe it was the main intention of Conway and Kochen to show that determinism 
and free will are incompatible. Presumably, they did assume some form of compatibilism that 
is possible in classical physics, but is necessarily violated by quantum mechanics. Although
I am by no means an expert on the philosophical problems involved,$^{78}$ I will try to give a
definition of determinism that might appeal to Conway and Kochen.$^{79}$
Definition 4.2. The response of a particle to an experiment is called locally deterministic if
it is completely determined by its past, given the choice of the actual experiment performed, 
for all possible choices of the experiment. 

In a certain sense this is a milder form of determinism than the ones given earlier, since it 
only requires that the immediate future is determined rather than the entire future. This is 
what the term 'locally' in the definition expresses; it refers to locality in time, rather than in 
space. Indeed, it is a form of determinism that is obeyed in the classical examples just given,$^{80}$
but it also seems that it is exactly a violation of this form of determinism that appears in the
proof of the Free Will Theorem. A possible objection against this definition of determinism is 
that it's also a more vague one, since it relies on the notion of 'experiment'. However, the use 
of this notion in the definition seems necessary if Conway and Kochen's Definition 4.1 is to 
make any sense, since that in turn relies on the notion of 'response'. At least, the definition is 
useful in the present discussion and it can be used to propose a new formulation of the Free 
Will Theorem that is sharper and emphasizes what is really proven: 

Theorem 4.3. The axioms SPIN, TWIN and MIN imply that the response of a spin-1 particle 
to a triple experiment is not locally deterministic. 

Or, even more precisely,$^{81}$

77 The term may be even more appropriately related to this discussion than it seems at first sight. In 
[BLN] it is argued that Markov came up with the idea of Markov chains partly to show that it is not true
that independence is a necessary condition for the law of large numbers to hold. This erroneous assumption 
was used by Nekrasov to argue in favor of Christianity and the existence of free will, which was a source of 
annoyance for Markov (being an atheist). 

78 [Ell] provides a clear presentation of the problem with free will in philosophy. Roughly stated, the problem
is that free will is incompatible both with determinism and indeterminism, but nonetheless seems to exist. 

79 I do not dare to claim that this definition makes free will compatible with determinism, but I do think 
that if compatibility is possible, this may be a possible step in the right direction. It is at least an attempt to 
alter our view on determinism instead of our view on free will, which seems the more customary approach to 
compatibilism. 

80 More generally, it is obeyed in any theory in which observables have definite values at all times that 
change continuously in time. In that case, the value assigned to some observable $A$ at time $t$ is given by the
limit $\lim_{t' \to t} A(t')$, which is completely determined by what happened at earlier times.

81 This last formulation avoids the use of the de Morgan law at the end of the proof of the Free Will 
Theorem, which is not generally accepted (e.g. in intuitionistic logic). Indeed, what is actually proven is that 
Theorem 4.4. The axioms SPIN, TWIN and MIN imply that the response of a spin-1 particle
to a triple experiment cannot be locally deterministic for all particles. 

Regardless of whether one accepts the radical definition of free will for particles introduced by
Conway and Kochen, I think the new formulation states more precisely what is
really asserted by the theorem (or rather, by its proof).

Overall, I think it is a pity that Conway and Kochen presented their theorem in the way
they did. If it had been formulated in the proposed way of Theorems 4.3 or 4.4, probably
more people would appreciate what is really accomplished by the theorem. Of course, the
extension of free will to elementary particles is a witty notion, but I was surprised to see that
even in [CK3] Conway states that the Free Will Theorem says that "if we human beings do
indeed have free will, then so do elementary particles have their own very small quantity of
free will." Einstein's valuable lesson that "a good joke should not be repeated too often" seems
to apply here.

4.4.2 The Possibility of Absolute Determinism 



In my reformulation of the Free Will Theorem (Theorems 4.3 and 4.4), the whole term 'free 
will' has been eliminated. However, the term 'Free Will Theorem' of course still applies, 
since it assumes the MIN axiom. For a true determinist (i.e. one who believes in a form of
determinism à la van Inwagen or Laplace), the notion of free will as imposed by the MIN
axiom is, obviously, appalling. Indeed, if one accepts this form of determinism, the choices
made by Alice and Bob are of course also pre-determined, i.e., they don't have an actual
choice. It is a shame that few philosophers who are determinists seem to take part
in the discussion. Fortunately, 't Hooft is both a respectable physicist and an
outspoken determinist. In [tH2] he argues in favor of determinism and against the notions of
free will proposed in the articles on determinism and quantum mechanics. In fact, he easily
disposes of the motivations for free will. About Tumulka's argument that

"[w]e should require a physical theory to be non-conspirational, which means here 
that it can cope with arbitrary choices of the experimenters, as if they had free 
will [...]. A theory seems unsatisfactory if somehow the initial conditions of the 
universe are so contrived that EPR pairs [e.g. the correlated pairs of particles 
which are postulated by the TWIN axiom] always know in advance which magnetic
fields the experimenters will choose." [Tum2]

't Hooft states that conspirational aspects of this form, like any conspirational aspects, are
difficult to object to from a deterministic point of view. Indeed, the feeling of a conspiracy
seems unavoidable if one accepts determinism, since in that case one may always answer the
question why things happen in a certain way by saying that it was simply determined to be
that way.$^{82}$

Besides disposing of the arguments in favor of free will, 't Hooft also argues against the 
notion of free will "meaning a modification of our actions without corresponding modifications 
of our past" [tH2]. From a deterministic point of view this is obvious, since every possible



the assumption that the responses of both particles are locally deterministic leads to a contradiction. So, for at 
least one of the particles the assumption isn't true. However, the proof doesn't provide a way for us to decide 
for which particle this is the case. 

82 One may, for instance, think that it is conspirational that the sun comes up every day. From a determinist 
point of view, what seems conspirational or not is just a matter of opinion. 
past implies a unique present, a modification of the present must imply a modification of the
past. Remarkably, 't Hooft also states that this sort of free will is already prohibited in the 
structure of quantum mechanics:

"Suppose we let an [annihilation] operator $a_i(x, t)$ act on a state, which means,
more or less, that we remove a particle at the point x, at time t. A different state
is then obtained, in which both the future and the past development of operators
look different from what they were in the old state." [tH2]

That is, any modification made in the present influences the past also in the structure of 
quantum mechanics. Or, put more strongly, according to 't Hooft quantum mechanics itself 
is in conflict with the MIN axiom. This is a dubious argument in my opinion. In the axioms 
of orthodox quantum mechanics, the role of the observer is described explicitly and the only 
alteration to the wave function that, according to these axioms, can be made by observers 
is its collapse. The act of annihilating particles is one that is not directly related to the 
axioms, and the question whether the mathematical description of this act (about which 't Hooft
speaks) has a direct relation to performing the act in real life (I do not know what this 
means exactly) is not easy to answer. But, to turn things around, if acts performed in the 
present by means of free will (e.g. turning on some magnetic field) do retroactively influence 
the probabilities of finding certain measurement results in the past, it would be possible to 
perform measurements that conflict with these probabilities in the present, since the wrong 
wave function would be used to describe the experiment (because one does not know if it will 
be altered by some act performed in the future). I am not sure if I am meeting the objections 
made by 't Hooft, since his arguments are often clouded by the use of mathematical symbols 
and technical terminology. But hopefully, I have given some hint about why I think that 
quantum mechanics is not incompatible with free will in principle, just like classical mechanics 
doesn't seem incompatible with free will in principle. 

Despite 't Hooft's determinism, he doesn't entirely reject the idea of free will. Instead, he 
proposes a modified axiom of free will. To understand this modification, one first needs to 
understand what is expected from a deterministic theory. In his own words: 

"All we should demand is that the model in question obeys the most rigid require- 
ments of internal logic. Our model should consist of a complete description of its 
physical variables, the values they can take, and the laws they obey while evolving. 
The notion of time has to be introduced if only to distinguish cause from effect: 
cause must always precede effect. If we would not have such a notion of time, we 
would not know in which order the 'laws of nature' that we might have postulated, 
should be applied." [tH2]

The notion of completeness used here is presumably the same as the one required by 
Einstein, Podolsky and Rosen. Being a deterministic model, it predicts what will happen in 
the future with certainty. However, exact calculations are assumed to be of such a complex 
nature that "one will be forced to make crude approximations." On the other hand, to explain 
the present situation, "numerous guesses concerning the past" have to be made. This is where
the free will of 't Hooft enters the discussion: although the model is about only one
possible development of events, it should also describe other possible developments. In his
own words, the modified axiom of free will states that

"we must demand that our model gives credible scenarios for a universe for any 
choice of the initial conditions!" [tH2]
Thus, although one is not allowed to freely choose the experiment to perform, one is allowed to
choose several initial conditions to check the theory. In particular, the theory should be defined
and make predictions for any such choice. Of course, this choice is not entirely free since an
actual free performance of calculations using pen and paper will affect the entire past. In fact,
one is only allowed to check the theory for the initial conditions one is determined to check by 
this same theory. In other words, there is no real notion of 'any choice of the initial conditions' 
in the deterministic world. Another possible objection to this axiom is that it presumes the 
notion of 'initial state of the universe', which, in general, cannot be given a sensible meaning,
except perhaps in the supposed deterministic model. Although I don't think that 't Hooft
succeeds in showing that a sense of free will is still possible in a deterministic world, he does
make an important point:$^{83}$ absolute determinism is a logical possibility.

But 't Hooft claims more than just the logical possibility of determinism; he also considers 
it possible that a deterministic theory of the universe might actually exist. I have reasonable 
doubts about this claim. Since such a theory would predict every action performed, a way to 
falsify it would be to find some of the experiments Alice and Bob are prohibited to perform 
and let them try to perform these experiments. This implies that the deterministic theory 
would actually make people aware of choices they are prohibited to make, which seems to 
me a very dubious situation (but indeed a logical possibility). A possible way out is that it 
is predetermined that such calculations (to determine prohibited experiments) will never be 
performed, but this seems unlikely because we've already come far enough to discuss plans of
performing these calculations (i.e., it will lead to a similarly dubious form of self-awareness).
The other option is that the calculations are simply too difficult to perform. In fact, the 
laws of nature must be of such a character that there doesn't exist any numerical method to 
calculate the directions of choice for Alice and Bob even within some reasonable precision. 
This appears to be the situation as 't Hooft envisages it. It therefore seems that 't Hooft has 
taken it to be his task to construct a theory that nobody understands and nobody can use. I'd 
like to note that this remark is not intended as a sneer but rather as a warning. I do believe 
that research on a possible deterministic theory is a noble cause, but I just don't see how it 
can succeed without ending in vague ambiguous technical terms. Hopefully, determinists will 
be aware of these lurking dangers. 

Finally, it should be noted that the Free Will Theorem in this case does establish that a 
potential determinist theory cannot possibly improve on the predictions of quantum mechanics 
for the experiments concerning spin-1 particles. The choice between free will and determinism 
therefore doesn't affect the way these experiments are described. But at least quantum theory 
takes the notion of experiment seriously, whereas from the deterministic point of view I can't 
help feeling a sense of defeat if possible experiments that would falsify the theory are prohibited 
even if these experiments can be performed in principle (i.e., they are not explicitly prohibited 
by the theory itself). 

4.4.3 Robustness 

The free will theorem may be seen as a modification of the Kochen-Specker Theorem. Both 
theorems use an argumentation of the form: 

'A certain type of theory should have certain properties. If these properties are 
satisfied, then one can construct a certain function on the unit sphere $S^2$ in $\mathbb{R}^3$.



83 Although it is by no means a new point.
Since such a function does not exist, a theory of such a type cannot exist.'

As seen in Chapter 3, the Kochen-Specker Theorem is susceptible to loopholes if one takes
the finite precision of measurement into account. Basically, the argument of Meyer [Mey] was
that for that 'certain type of theory' the 'certain function' doesn't need to be defined on the
entire set $S^2$. Since both the Kochen-Specker Theorem and the Free Will Theorem are based
on the non-existence of the same function, it seems likely that Meyer's argument can also be
used in this case. Although Conway and Kochen provide some discussion on the robustness$^{84}$
of their theorem [CK2], they do not explicitly go into the argument of Meyer (in fact, they
don't even mention the MKC models). I will argue here that Meyer's argument in fact does
not apply to the free will theorem.

The original Kochen-Specker argument is set up so as to render realist theories that repro- 
duce quantum-mechanical predictions impossible. From a realist point of view, each observable 
must be assigned a definite value such that one can unambiguously speak about actual
properties possessed by a single system. The argument then runs that with each point on the unit
sphere $S^2$ in $\mathbb{R}^3$ (more precisely, with each ray in $\mathbb{R}^3$) there must be an observable associated.
Given that the definite value assignment should satisfy certain rules to reproduce
the quantum-mechanical predictions, one obtains a contradiction. In fact, it is sufficient to
consider only a finite subset of points in $S^2$ to obtain a contradiction. The loophole in this
argument is that if one takes into account the finite precision of measurement, one can no longer
maintain the argument that indeed every point in $S^2$ should coincide with an observable. In
fact, it is enough to consider some dense countable subset.

The Free Will Theorem is different in spirit from the Kochen-Specker Theorem in that it 
does not start from the assumption of a realist approach toward nature. Instead, it focuses 
on the possibility of determinism and only assumes that measurements have outcomes. So 
no assignment of definite values to observables is required, but instead an assignment of 
definite values to measurement results is called for. Then, although not every point in the 
uncolorable subset $S'$ of $S^2$ may correspond to an actual observable, such points do denote
possible experiments that can be performed, and hence a definite value assignment to them is 
required. 

As seen, the MKC models respond to this situation in a peculiar way. Each time a
measurement corresponding to a triad in the uncolorable subset is performed, they claim
that what is in fact measured are the values of some observables that together correspond
to a triad close to the original triad one set out to measure. The indeterminacy of quantum
mechanics is then retrieved by a probability distribution on the hidden states and hence on
the values possessed by the actual observables. For an uncolorable subset $S'$, there are triads
$x,y,z$ and $x,y',z'$ such that the observable actually measured if one wishes to measure $\sigma_x^2$
is necessarily a different one in each of the triads, possessing a different value. This is how
the MKC-models satisfy the SPIN axiom. However, to satisfy the TWIN axiom, the models
introduce a form of non-locality. Indeed, after Bob has performed his measurement, the state
of the entire system changes in such a way that Alice's measurement will automatically satisfy
the TWIN and MIN axioms. At least, this is the case for the modified MKC-models proposed
in Section 3.6. In the original models, a discrepancy arises similar to the ones encountered in
Section 3.5.3 and Section 3.7.

This form of non-locality cannot be sidestepped, since the TWIN axiom requires that the
value obtained upon measurement of $\sigma_x^2$ is the same for both triads $x,y,z$ and $x,y',z'$. That
is, in the view of the MKC-models, it is required that different observables, which are close
to each other, must be assigned the same definite value. Since the definite value assignments
are highly discontinuous at certain (non-negligible) regions (as they must be, as proven in
[App4]), the MKC-models cannot satisfy this criterion.

84 Robust under taking the finite precision of measurements into account.

This result is not surprising. Clifton and Kent [CK1] already noticed that their models
should be non-local. The ensuing action at a distance allows nature to select a triad that
satisfies the quantum-mechanical predictions. It is, however, interesting to see that the argument
of non-locality turns from the requirement that 'each observable must be assigned a
definite value independent of the measuring context' to the requirement that 'different observables
must be assigned the same definite value'. In a certain sense this also explains why the
Free Will Theorem is robust. If locality is taken into account, the SPIN and TWIN axioms
require that most observables close to each other are assigned the same value, which entails
robustness.

A more extensive treatment of why the Free Will Theorem is robust is also given in [CK2].
However, the argumentation used there appears to be a bit vague. The starting assumption
is that it may of course be the case that the axioms SPIN and TWIN are in fact only
approximately true and are more likely to be of the form:

SPIN_fp: Measurements of the squared (components of) spin of a spin-1 particle
in three nominally orthogonal directions give the answers⁸⁵ 1,0,1 in some order
with a frequency of at least $1 - \epsilon_S$.

TWIN_fp: For twinned spin-1 particles, suppose experimenter A performs a triple
experiment of measuring the squared spin component of particle a in three nominally
orthogonal directions x,y,z, while experimenter B measures the twinned
particle b in one direction, w. Then if w happens to be nominally in the same direction
as one of x, y, z, experimenter B's measurement will yield the same answer
as the corresponding measurement by A with a frequency of at least $1 - \epsilon_T$.

Then, assuming an upper bound for the finite precision of measurement, an upper bound is
derived for $3\epsilon_T + \epsilon_S$ using quantum theory. This confuses me, for why would one want to
derive an upper bound for $3\epsilon_T + \epsilon_S$ using an assumption (quantum mechanics) that already
states that $3\epsilon_T + \epsilon_S = 0$? However, the upper bound derived by Conway and Kochen will still
be useful for the following discussion.

Experimental investigations of the SPIN and TWIN axioms will provide upper bounds for
the parameters $\epsilon_S$ and $\epsilon_T$. These upper bounds in turn provide an upper bound $f_{\max}$ for the
frequency with which the conjunction of SPIN and TWIN will be violated. On the other hand,
for deterministic theories an estimate for the lower bound $f_{\min}$ of the frequency with which
experimental violations of SPIN and TWIN will appear can be made. Then, if $f_{\max} < f_{\min}$,
the Free Will Theorem shows that a deterministic theory is not possible, even if the weakened
versions SPIN_fp and TWIN_fp of SPIN and TWIN are assumed to be true.

As to the bound $f_{\min}$, note that there are in fact $1320 = \#(E_A \times E_B)$ possible experiments
that Alice and Bob together can choose from. If each of these experiments is assigned a definite
outcome, at least one of these assignments must conflict with either the SPIN or the TWIN
axiom. Consequently,⁸⁶ $f_{\min} = 1/1320$.

85 One may assume that there are intervals $I_1$, $I_0$ containing 1 resp. 0 such that a measurement result in $I_1$
(resp. $I_0$) may be interpreted as the result 1 (resp. 0).

The estimate for $f_{\max}$ is a bit more tricky; here, an assumption on the frequency with which
Alice and Bob choose their experiments is required. For the sake of simplicity, it is assumed
that each of them independently chooses an experiment at random. Let $x,y,z$ denote the
triad selected by Alice and $m$ the direction chosen by Bob. The corresponding measurement
results will be denoted by $M_x, M_y, M_z$ and $M_m$. Whenever $m \in \{x,y,z\}$ (nominally),⁸⁷
$M_{m,A}$ denotes the measurement result obtained by Alice for the direction $m$. Furthermore,
let [101] denote the set $\{(0,1,1), (1,0,1), (1,1,0)\}$. Using this notation, one finds

\begin{align*}
f_{\max} &= P[\neg\mathrm{TWIN} \vee \neg\mathrm{SPIN}] \\
&= P[((M_x,M_y,M_z) \notin [101]) \vee (M_m \neq M_{m,A} \wedge m \in \{x,y,z\})] \\
&\leq P[(M_x,M_y,M_z) \notin [101]] + P[M_m \neq M_{m,A} \mid m \in \{x,y,z\}] \, P[m \in \{x,y,z\}] \\
&\leq \epsilon_S + \epsilon_T \, P[m \in \{x,y,z\}] \\
&= \epsilon_S + \epsilon_T \left( \frac{3}{33}\,\frac{16}{40} + \frac{2}{33}\,\frac{24}{40} \right) = \epsilon_S + \frac{4}{55}\,\epsilon_T,
\end{align*}

where I used (implementing the randomness of the settings) that 16 of the 40 triads contain
3 directions that may be chosen by Bob, and the other 24 only contain 2 directions that may
be chosen by Bob. It follows that, if it can be derived from experiment that $\epsilon_S$ and $\epsilon_T$ are
actually so small that

$$\epsilon_S + \frac{4}{55}\,\epsilon_T < \frac{1}{1320}, \qquad (190)$$

the Free Will Theorem holds. Here, the calculation of Conway and Kochen comes in. They
derived (assuming that quantum mechanics is true) that it is reasonable to assume that actual
experimental tests of SPIN and TWIN will yield

$$3\epsilon_T + \epsilon_S < \frac{1}{2900000}, \qquad (191)$$

which is clearly sufficiently small to entail (190). But of course this is all just theoretical
sophistry, which only becomes relevant when actual experimental tests of SPIN and TWIN
are performed.
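
The arithmetic behind the factor 4/55 and the claim that (191) entails (190) can be checked with exact fractions. The following is a minimal sketch in Python; the function names and the sample values of $\epsilon_S$ and $\epsilon_T$ are illustrative assumptions and are not taken from [CK2] or from any experiment.

```python
from fractions import Fraction

# Counts taken from the text: Alice picks one of 40 triads, Bob one of 33 directions.
# 16 triads contain 3 of Bob's directions, the other 24 contain 2.
N_TRIADS, N_DIRECTIONS = 40, 33


def prob_direction_in_triad():
    """P[m in {x,y,z}] for independent, uniformly random choices of triad and direction."""
    return (Fraction(3, N_DIRECTIONS) * Fraction(16, N_TRIADS)
            + Fraction(2, N_DIRECTIONS) * Fraction(24, N_TRIADS))


def f_max(eps_s, eps_t):
    """Upper bound eps_S + (4/55) eps_T on the frequency of SPIN/TWIN violations."""
    return eps_s + prob_direction_in_triad() * eps_t


# Lower bound for deterministic theories: at least 1 of the 1320 experiments must fail.
F_MIN = Fraction(1, N_TRIADS * N_DIRECTIONS)

# Hypothetical values chosen so that 3*eps_T + eps_S < 1/2900000, as in (191).
eps_s = eps_t = Fraction(1, 12_000_000)

print(prob_direction_in_triad())                    # 4/55
print(3 * eps_t + eps_s < Fraction(1, 2_900_000))   # True
print(f_max(eps_s, eps_t) < F_MIN)                  # True
```

Since $4/55 < 3$, any pair of non-negative values satisfying (191) also satisfies (190); the sample values only make this concrete.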



4.4.4 What does the Free Will Theorem add to the Story?

The Bell Inequality

Roughly stated, the claim made by the Free Will Theorem seems to be that determinism is
a logical impossibility if one accepts certain forms of free will, locality, and some quantum-mechanical
predictions. This claim strongly resembles the main conclusion of Bell's impossibility
proof for contextual hidden variables. At first sight, the conclusions drawn from Bell's
inequalities seem to be much stronger, since also certain types of indeterministic hidden-variable
theories appear to be excluded. However, the assumptions made to derive a Bell-type
inequality turn out to be stronger too (or at least different). A comparison may be made by
translating the Bell-type argument into the language of the Free Will Theorem.

86 Conway and Kochen derive here the frequency 1/40. Somehow, they come to the conclusion that a
violation of SPIN $\wedge$ TWIN necessarily results in a violation of SPIN. I don't see why this should be true.

87 It is reasonable to assume that Alice and Bob have agreed upon when this conclusion may be drawn and
when not.

In the derivation of the Bell inequality for coupled spin-1/2 particles, certain analogues of
the following axioms may be recognized:

SPIN': Measurements of the spin of a spin-1/2 particle in any direction always give
one of the answers -1 or 1.

TWIN': For twinned spin-1/2 particles, suppose experimenter A performs an experiment
of measuring the spin of particle a in the direction $x_a$, while experimenter
B measures the spin of the twinned particle b in one direction, $x_b$. Then if $x_b$ happens
to be the same direction as $x_a$, experimenter B's measurement will necessarily
yield the opposite answer from the corresponding measurement by A.
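
For the quantum-mechanical singlet state, the content of TWIN' can be checked directly. The sketch below is an illustration under assumptions that are not part of the text above: the two particles are taken to be in the singlet state, measurement directions are restricted to the x-z plane and parametrized by an angle, and all function names are hypothetical.

```python
import math


def spin_state(outcome, theta):
    """Eigenvector of the spin along the direction (sin theta, 0, cos theta).

    outcome is +1 or -1; the vector is given in the basis of spin up/down along z.
    """
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, s) if outcome == +1 else (-s, c)


def joint_probability(outcome_a, theta_a, outcome_b, theta_b):
    """P(A finds outcome_a along theta_a and B finds outcome_b along theta_b)."""
    u = spin_state(outcome_a, theta_a)
    v = spin_state(outcome_b, theta_b)
    # Singlet state (|01> - |10>)/sqrt(2); u and v are real, so the overlap is:
    amplitude = (u[0] * v[1] - u[1] * v[0]) / math.sqrt(2)
    return amplitude ** 2


theta = 0.7  # arbitrary common direction for A and B
same = joint_probability(+1, theta, +1, theta) + joint_probability(-1, theta, -1, theta)
opposite = joint_probability(+1, theta, -1, theta) + joint_probability(-1, theta, +1, theta)
print(same, opposite)
```

When both experimenters use the same direction, equal outcomes have probability zero and opposite outcomes have probability one (up to rounding), which is the quantum-mechanical prediction that TWIN' turns into an axiom.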

The MIN-axiom for this system would read as follows:

MIN': Assume that the experiments performed by A and B are space-like separated.
Then experimenter B can freely choose any direction $x_b$, and a's response is
independent of this choice. Similarly and independently, A can freely choose any
direction $x_a$, and b's response is independent of that choice.

However, it should be noted that this doesn't directly translate to the notions of locality
CILOC and OILOC and the notion of free will assumed to derive the Bell inequality (see the
discussion at the end of Section 2.3.4). This is partly because MIN (and therefore MIN') uses
very general terms like 'independent' and 'freely' whose meaning only becomes apparent by their
use in the proof. On the other hand, the axioms CILOC and OILOC are of such a mathematical
nature that it isn't quite apparent what they express exactly. In fact, their meaning is only
mathematically evident within the framework of Kolmogorov's probability theory. Indeed, it
seems to be necessary for the derivation of the Bell inequality to assume that the measure-theoretic
approach towards probability theory is suitable for describing actual probabilities
of events in the real world. Among physicists this has become widely accepted, but many
logicians would beg to differ.⁸⁸ In fact, one may even consider the violation of Bell inequalities
by quantum mechanics as an indication that the measure-theoretic approach is indeed flawed.

Overall, I think that the physical meaning of the various assumptions in the derivation 
of the Bell inequality is somewhat clouded by mathematical terms. For example, as seen in 
Section 3.7, the notion of factorizability plays a crucial role, but it is not easily motivated
without the use of the mathematical framework of quantum mechanics. However, it should 
be mentioned that the passage of time has given people the opportunity to find implicit 
assumptions in the derivation of the Bell inequality. It is likely that in the future similar 
hidden assumptions will be found for the Free Will Theorem. 

Coupled Spin-1 Particles 



The main thrust of Lemma 4.2 was known long before Conway and Kochen came up with their
theorem [HR], [Sta], [BS2], and it has even been stated that the Free Will Theorem indeed
expresses nothing that wasn't already known [GTTZ]. At first glance, this does indeed seem to
be the case. However, a closer inspection shows that there are important differences between
the Free Will Theorem and these earlier articles.

gives a nice overview of logical probability. It also notes some of the axioms made by Kolmogorov
that may be relaxed.

To obtain a clear view of these differences, it is good to first consider the similarities. In 
all articles one may distinguish three assumptions AS1, AS2, AS3. The argument may then 
be arranged to have the following abstract form. 

(i) Because of AS1, all 40 x 33 = 1320 experiments must be assigned a definite outcome. 

(ii) Because of AS2, each of the 1320 individual outcomes must be in accordance with 
quantum-mechanical predictions. 

(iii) Because of AS3, the assignments must be non-contextual. 

(iv) Because of the Kochen-Specker Theorem, this is not possible and therefore, the three 
assumptions AS1, AS2 and AS3 cannot all be true. 

In the Free Will Theorem, the assumptions are as follows: 

AS1: MIN and (local) determinism. 

AS2: MIN, SPIN and TWIN. 

AS3: MIN. 

In the articles [HR], [Sta] and [BS2] the assumptions are not formulated in this form, and
it takes some more work to recover this line of reasoning from these articles. Also, none of
the articles refers to any specific set of experiments, which makes it harder to construct actual
experiments to test the theorems. In [HR] and [Sta], (i) is partly motivated by an appeal to
realism:

"We shall be concerned with the sort of realism which asserts at least that at all
times and in all states every physical magnitude which pertains to the system has
some value" [HR, p. 482]



And also in [Sta] it is noted that value definiteness is motivated by 'classical realism'.⁸⁹ It is
assumed that the correlation between possessed (existential) values and outcomes of experiments
is given through the rule of Faithful Measurement:

FM: "Any measurement of a physical magnitude Q reveals the value which Q had
immediately prior to the measurement" [HR, p. 483]

Without such a rule, a theorem can only exclude certain existential behavior for a theory, 
which is of course of not much interest if it is not related to any empirical behavior. Stairs 
is mainly concerned with existential behavior and therefore doesn't address the notion of 
experimental outcomes. Brown and Svetlichny have a more direct approach to (i), which is 
similar to the one followed by Conway and Kochen: 

"In a deterministic h.v. theory, once all the values of the relevant parameters are 
fixed at $t$, the predicted outcome of a measurement of any observable $A$ - we shall
denote it by $[A]^t$ - is uniquely determined." [BS2, p. 1380]



However, Brown and Svetlichny are not very clear about their motivation for stating that the
outcome for any observable should be determined. In fact, in a deterministic theory this is
only required to hold for the observables that are determined to be measured. The only two
reasons I can think of for requiring that the claim should also hold for other observables are
an appeal to either realism or free will.⁹⁰ Since Brown and Svetlichny state that

"it is not required that $[A]^t$, for every observable $A$, be interpreted to represent
an objective element of reality [...] associated with $A$, which the measurement
process somehow faithfully reveals at $t$." [BS2, p. 1384],

I take it that they implicitly rely on a notion of free will.

89 Part of the investigation of Stairs concerns the question whether or not value definiteness is a necessary
condition for realism. A form of realism in which this condition may be relaxed is termed 'quantum realism'
by Stairs.

For step (ii), it is necessary to make assumptions on what constitutes (a part of) the set 
of observables if it is assumed that measurements reveal the values of observables. In all 
three articles [HR, Sta, BS2] it is assumed that observables can be associated with self-adjoint
operators on a Hilbert space, at least, for the particular system of two spin-1 particles. Brown 
and Svetlichny actually (implicitly) require that all self-adjoint operators can be associated 
with an observable by relying on Gleason's lemma in their proof. The other proofs rely on the 
Kochen-Specker Theorem and therefore only require a finite subset of self-adjoint operators 
to correspond to observables. 

In [HR], (ii) is motivated by assuming functional relationships between the observables
and the Value Rule:

VR: If for a quantum-mechanical state $\psi$ it holds that

$$\langle \psi, \mu_A(\{a\}) \psi \rangle = 0, \qquad (192)$$

then $\lambda_C(\mathcal{A}) \neq a$ for all measuring contexts $C$, for all $\lambda$ that are supposed to occur
in the state determined by $\psi$.

Together with FM this implies that individual outcomes of the 1320 experiments are in accordance
with quantum-mechanical predictions. This seems trivial at first sight, but it should
be noted that actually a modified version of FUNC is assumed to hold for the observables,
which allows contextuality. I will formulate this here in terms that concur with the ones I
used throughout this thesis (see Section 2.3.4).



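FUNC*: Let $\mathcal{A}_1$, $\mathcal{A}_2$ and $\mathcal{A}_3$ be three observables corresponding to operators
$A_1$, $A_2$ and $A_3$ such that

$$A_1 = f_1(A_3), \quad A_2 = f_2(A_3) \quad \text{and} \quad A_1 = g(A_2) \qquad (193)$$

for certain functions $f_1$, $f_2$ and $g$. Assume that $A_3$ is a maximal operator (i.e., its
spectral decomposition consists only of 1-dimensional projections), then

$$\lambda_{\{\mathcal{A}_1,\mathcal{A}_3\}}(\mathcal{A}_1) = g\left(\lambda_{\{\mathcal{A}_2,\mathcal{A}_3\}}(\mathcal{A}_2)\right) \qquad (194)$$

should hold for all $\lambda$.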



90 Sometimes, a more direct assumption is used called 'counterfactual definiteness', which simply means that
unperformed measurements also have (potential) outcomes. However, on its own, counterfactual definiteness
doesn't seem a convincing assumption to me if it isn't motivated by either realism or free will.

For a single spin-1 particle, the operator

$$S = s_1 \sigma_x^2 + s_2 \sigma_y^2 + s_3 \sigma_z^2 \qquad (195)$$

is maximal for any triple of distinct real numbers $s_1, s_2, s_3$. Consequently, FUNC* implies
that in a context in which $\mathcal{S}$ can be measured, the observables $\sigma_x^2, \sigma_y^2, \sigma_z^2$ should be assigned
values in accordance with the FUNC rule. However, for the system of two spin-1 particles, $S$
is no longer maximal, but the functional relationship can still be derived under the following
extra assumption:⁹¹



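Local Contextuality (LOCC): If $\mathcal{H}$ and $\mathcal{H}'$ are the Hilbert spaces for two
spatially separated systems and $\mathcal{S}$ is an observable associated with the operator
$S \otimes 1$, where $S$ is a maximal operator on $\mathcal{H}$, then

$$\lambda_{\{\mathcal{S},\mathcal{A}_1\}}(\mathcal{S}) = \lambda_{\{\mathcal{S},\mathcal{A}_2\}}(\mathcal{S}) \qquad (196)$$

should hold for all hidden states $\lambda$ and for all observables $\mathcal{A}_1, \mathcal{A}_2$ corresponding to
maximal operators $A_1, A_2$.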



This LOCC assumption⁹² can also be read in the MIN axiom of the Free Will Theorem.
The FUNC* assumption, however, has no direct counterpart. In fact, one may argue that
it is not well motivated at all in [HR] or [Sta]. Surely, if the FUNC relation is not satisfied
for observables that can be measured simultaneously, the hidden-variable theory will lead to
conflicting predictions whenever these observables are being measured. But this only motivates
the idea that FUNC should hold for the observables that will actually be measured. Extending
this to all observables may, again, only be motivated by appealing to the free choice of the
experimenters. Indeed, realism can only motivate the requirement that the observables have
an actual value, but cannot imply that these values should satisfy the FUNC rule (unless one
adopts a many worlds interpretation).

In any case, Brown and Svetlichny replace the FUNC* assumption with the following
one:⁹³



Excluded Joint Events (EJE): If, for a system described by the quantum-mechanical
state $\psi_t$, one has

$$\langle \psi_t, \mu_{A_1}(\{a_1\}) \otimes \mu_{A_2}(\{a_2\}) \psi_t \rangle = 0, \qquad (197)$$

i.e., the probability of finding the value $a_1$ when measuring $A_1 \otimes 1$ and the value
$a_2$ when measuring $1 \otimes A_2$ is zero, where $A_1$ and $A_2$ are maximal operators on the
individual Hilbert spaces, then either $[A_1 \otimes 1]^t \neq a_1$ or $[1 \otimes A_2]^t \neq a_2$, or both.

91 The assumption stated here is not entirely the same as the one used by Heywood and Redhead, which
is known as ontological locality. However, the differences are extremely minor and irrelevant for the present
discussion.

92 It is here stated as a notion of locality, but it may also be viewed as a notion of global non-contextuality,
i.e., it states that contextuality is only allowed locally.

93 A similar approach is followed by Stairs.

It is sufficient for them to only consider observables that are locally maximal, because they
view the triple experiments of Alice as the measurement of a single observable of the form of
(195), which indeed is maximal. The measurements performed by Bob are not taken to be
measurements of the squared spin but of the spin itself (i.e. $\sigma_w$ instead of $\sigma_w^2$), which is also
associated with a maximal operator. Besides this assumption, they also rely on a notion of
locality that is similar to the notion of LOCC. However, as with the FUNC* assumption, the
EJE assumption for all $A_1$ and $A_2$ may only be motivated with the aid of the assumption of
free choice of the experimenters. Indeed, events that are assigned probability zero in quantum
mechanics may be determined never to occur in a hidden-variable theory. But this doesn't
imply that either $[A_1 \otimes 1]^t \neq a_1$ or $[1 \otimes A_2]^t \neq a_2$, or both. It would be sufficient to require
that $A_1 \otimes 1$ and $1 \otimes A_2$ will not both be measured whenever $[A_1 \otimes 1]^t = a_1$ and $[1 \otimes A_2]^t = a_2$
(and this would indeed imply EJE if free will is assumed).
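
The maximality of an operator of the form (195) can be verified in a convenient basis. The sketch below assumes the Cartesian representation of spin 1, $(S_k)_{ij} = -i\epsilon_{kij}$ with $\hbar = 1$; this choice of basis and all function names are assumptions made for illustration, not notation from this thesis. In this basis the squared spin components are diagonal with entries 0 and 1, so the eigenvalues of $S$ can be read off from its diagonal.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def spin_matrix(k):
    """Spin-1 component S_k in the Cartesian basis: (S_k)_ij = -i * eps_kij."""
    return [[-1j * levi_civita(k, i, j) for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]


# Squared spin components; each is the identity with a 0 at diagonal position k.
squares = [matmul(spin_matrix(k), spin_matrix(k)) for k in range(3)]


def s_operator(s1, s2, s3):
    """S = s1*sigma_x^2 + s2*sigma_y^2 + s3*sigma_z^2 as in (195)."""
    coeffs = (s1, s2, s3)
    return [[sum(c * sq[i][j] for c, sq in zip(coeffs, squares)) for j in range(3)]
            for i in range(3)]


def is_maximal(s1, s2, s3):
    """S is diagonal in this basis, so it is maximal iff its diagonal entries differ."""
    s = s_operator(s1, s2, s3)
    assert all(s[i][j] == 0 for i in range(3) for j in range(3) if i != j)
    return len({s[i][i] for i in range(3)}) == 3


print(is_maximal(1, 2, 4))  # True: eigenvalues 6, 5, 3
print(is_maximal(1, 1, 2))  # False: eigenvalue 3 is degenerate
```

The eigenvalues are $s_2 + s_3$, $s_1 + s_3$ and $s_1 + s_2$, which are pairwise distinct exactly when $s_1, s_2, s_3$ are.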

In [BS2], the third step (iii) now follows almost immediately. Indeed, using EJE it follows
that the values assigned to the measurement outcomes for Bob do not depend on the
experiment performed by Alice. It also is derived from EJE that the value assignments for
Bob should satisfy the FUNC rule (which is excluded by the Kochen-Specker Theorem and
by Gleason's lemma). The argument by Heywood and Redhead is quite lengthy and relies on
yet another assumption of locality, called Environmental Locality. It is not very interesting
to discuss the details here.

From the above discussion, several conclusions may be drawn about the differences between 
the Free Will Theorem and the earlier articles mentioned above. 

(i) Heywood and Redhead and Stairs rely on a notion of realism to motivate value definite- 
ness for outcomes of experiments, whereas Brown and Svetlichny don't motivate value 
definiteness that well at all. The Free Will Theorem relies on a strong notion of free 
will and the idea that measurements have outcomes to motivate value definiteness for 
all possible experiments. 

(ii) All earlier articles rely on the theoretical structure of quantum mechanics. That is, the 
theorems are all formulated and proven in terms of the mathematics of Hilbert spaces, 
the tensor direct product of Hilbert spaces, local maximal self-adjoint operators, etc. By 
the use of this language it isn't very clear if the proven statements actually apply to all
possible hidden-variable theories.

(iii) The earlier articles rely on strong abstract assumptions like FUNC* and EJE, which 
actually imply the clearer and weaker assumptions SPIN and TWIN. 

(iv) The abstract assumptions FUNC* and EJE are not well motivated and may perhaps
only be well motivated by the same strong notion of free will assumed by Conway and
Kochen.⁹⁴

Even if the above conclusions don't convince the reader that the Free Will Theorem actually
states something new and adds something to the foundations of quantum mechanics, I think
that the Free Will Theorem does present a clearer argument. In each of the articles [HR, Sta,
BS2], either the starting assumptions are not stated very clearly, or it is not very clear from
the proof in which steps these assumptions are used or why the other steps are not based
on extra implicit assumptions. It should be noted, however, that on this point a lot of work
remains to be done for the Free Will Theorem as well, especially for the ways it was formulated
and proven in [CK2] and [CK4] (as also noted throughout this chapter).

94 Actually, an appeal to this form of free will may already be found in [EPR], where Einstein, Podolsky and
Rosen use it to motivate the incompleteness of quantum mechanics. However, there is no real consensus on
whether this notion of free will is necessary to run their argument [Fin2].






5 The Strangeness and Logic of Quantum Mechanics 



Quantum theory leads to a logic in which there is room for
not-knowing.

G. Vertogen



5.1 The (In)completeness of Quantum Mechanics (Part II)



The discussion in the previous chapters was about hidden variables rather than about quantum
mechanics itself. The main conclusion that can be drawn from this discussion is that it
seems impossible to resolve the Einstein-Podolsky-Rosen paradox (of Example 2.2) in the way
envisaged by those same people without the use of dubious assumptions (i.e. either non-locality
or absolute determinism). The appropriate marginal note should always be that this
conclusion is based on abstract mathematical considerations making use of numerous explicit
and implicit assumptions. However, as it stands it seems to me that even after more than
seventy years since [EPR], the standard realist approach has provided no satisfactory way of
understanding quantum mechanics. To use a metaphor: it is raining in the realist quantum
world. And as long as this is the case, one might as well go for a stroll in the Copenhagen
quantum garden. Indeed, throughout the development of quantum mechanics Bohr maintained
that the theory is in fact complete. Perhaps the conflict between Bohr and Einstein can be
resolved by reaching an understanding of why Bohr came to this conclusion.



Bohr's first response to the Einstein-Podolsky-Rosen paradox quoted in Section 2.2 has
puzzled many minds over the years. In terms of Example 2.2 it seems that Bohr is saying
that the phenomenon "spin along the z-axis of the second particle" before the measurement
on the first particle is a phenomenon different from the one after that measurement, since the
experimental context has changed. This seems somewhat reasonable, since the prediction of
the value of the spin along the z-axis of the second particle requires the measurement on the
first particle. Note that this does not have to be seen as altering the observed system. That
is, one may still think that the second particle remains undisturbed under the measurement
of the first particle. What merely changes is the measuring context. But then, if the system
remains unchanged, why is it described as being in a different state (as a consequence of the
von Neumann postulate)? What is the notion of a 'state of a system' according to Bohr?
Bohr answers these considerations in the following way:

"In fact the paradox finds its complete solution within the frame of the quantum-mechanical
formalism, according to which no well-defined use of the concept of
"state" can be made as referring to the object separate from the body with which
it has been in contact, until the external conditions involved in the definition of
this concept are unambiguously fixed by a further suitable control of the auxiliary
body." [Boh3, p. 21]

Hence it seems that according to Bohr the quantum-mechanical state of a system only has
a meaning within a specified measuring context. In the quantum-mechanical formalism one
should therefore not speak of the state of a system, but instead, of the state $\psi$ relative to each
possible measuring context. But isn't this bluntly accepting that the quantum-mechanical
description is incomplete? If the quantum-mechanical formalism does not provide one with
the state of a system, shouldn't one search for a theory that does? But according to Bohr,
this would be asking the wrong question. To say that there would be such a thing as the state
of the system would be to say that different measuring contexts could be compared with each
other in an unambiguous way. The impossibility of doing so may be seen as the core lesson
Bohr draws from quantum mechanics. In his own words:

"[T]he impossibility of subdividing the individual quantum effects and of separating 
a behavior of the objects from their interaction with the measuring instruments 
serving to define the conditions under which the phenomena appear implies an 
ambiguity in assigning conventional attributes to atomic objects which calls for 
a reconsideration of our attitude toward the problem of physical explanation." 
[Boh4, p. 317]

Thus phenomena are only well-defined within a specified measuring context. Bohr consid- 
ers this peculiarity of quantum mechanics to be fundamental, and this leads him to introduce 
a new philosophical concept: phenomena that require different experimental setups may be 
defined as complementary. Bohr states that information about the same object obtained by 
different experimental arrangements is complementary. It should be noted that Bohr places 
this philosophical concept prior to empirical experience. For Bohr, the fact that different mea- 
suring devices are required to measure different aspects of the same object is only considered 
empirical evidence for complementarity: 

"Such empirical evidence exhibits a novel type of relationship, which has no ana- 
logue in classical physics and which may conveniently be termed "complementarity" 
in order to stress that in the contrasting phenomena we have to do with equally 
essential aspects of all well-defined knowledge about the objects." [Boh4, p. 314]

But as a philosophical concept it stands on its own:

"[C]omplementarity presents itself as a rational generalization of the very ideal of
causality." [Boh4, p. 317]

However, Bohr does not explain why the notion of complementarity is obvious and nec- 
essary. The vagueness of this notion also doesn't help much to see why quantum mechanics 
is complete and if so, in what sense. Especially the necessity of complementarity remains 
unclear, and at first sight it just seems yet another strange aspect of quantum mechanics that 
may be removed in a possible succeeding theory. In the end, it seems as if Bohr just tries to 
resolve the strangeness of quantum mechanics by "conveniently terming it complementarity". 
This unsatisfied feeling is not new, of course. In 1963 even Bohr's ally Rosenfeld had the 
following to say: 

"Complementarity is no system, no doctrine with ready-made precepts. There is 
no via regia to it; no formal definition of it can even be found in Bohr's writings, 
and this worries many people." [WZ, p. 85]

Personally, I think that it is only after the search for a hidden-variable theory has failed 
that some more meaning can be found in the notion of complementarity, though not necessarily 
in favor of Bohr's interpretation. The principle of complementarity may only shed some light 
on the matter if one could find a philosophical argument for this principle, preferably based 
on considerations outside quantum mechanics. If this can be done, it will be most likely that a 
found notion of complementarity will differ from what Bohr had in mind. In the remainder of 
this Chapter, I will undertake a short investigation of the strangeness of quantum mechanics 
to finally come to a proposal for a philosophical foundation of complementarity.

5.2 Quantum Mechanics as a Hidden-Variable Theory

It remains fascinating that science works in such a way that a theory like quantum mechanics
may emerge, differing fundamentally from a priori notions of what a theory should be. The
structure and requirements imposed on a hidden-variable theory that lead to impossibility
proofs seem so reasonable that it is remarkable that they cannot be met. In the words of Bell
[Bel2]: "That so much follows from such apparently innocent assumptions leads us to question
their innocence." In fact, one may wonder if quantum mechanics avoids the objections raised
against possible hidden-variable theories (and if so, how). Can quantum mechanics itself be
interpreted as a hidden-variable theory?⁹⁵

In quantum mechanics the pure states are given by vectors ifi in a Hilbert space T~L. These 
states do not determine the value of all observables, but instead they assign a probability to 
possible measurement outcomes for each observable in such a way that with each observable 
one can associate a probability space in the following way: 

Lemma 5.1. For every observable $\mathcal{A}$ corresponding to the operator $A$ with spectrum $\sigma(A)$ and
every state $\psi$, the triple $(\sigma(A), \Sigma_A, P_{\psi,A})$ is a probability space, where $\Sigma_A$ is the $\sigma$-algebra of
all Borel subsets of $\sigma(A)$ and $P_{\psi,A}$ is defined by the Born postulate:

$P_{\psi,A}(\Delta) = \langle\psi, \mu_A(\Delta)\psi\rangle, \quad \forall \Delta \in \Sigma_A$. (198)



The proof is straightforward and well-known and therefore I omit it here. Although this 
may be an unsurprising result, it is significant for the present discussion. Apparently, for 
each observable separately, quantum mechanics behaves like a hidden-variable theory. That is,
one may choose one observable (or, more generally, a set of observables whose corresponding 
operators mutually commute) and implement it as corresponding to an element of physical 
reality in the sense of Einstein-Podolsky-Rosen. One may then adopt the view that this 
specific observable does have a definite value at all times and adopt an ignorance interpretation 
towards the quantum-mechanical state of the system. This view plays an important role in 
so-called modal interpretations of quantum mechanics [DD]. As far as I know, none of these
interpretations solves all the problems encountered in the earlier Chapters in a satisfactory
way. For example, (the modal interpretation of) Bohmian mechanics (in which position is the 
special selected observable) is still non-local. 
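
Lemma 5.1 can be illustrated numerically for a single spin-1/2 particle. The following Python sketch is only an illustration under stated assumptions: the helper names (`spectral_projection`, `born_probability`, `measure`), the axis angle 1.1 and the test state are hypothetical choices, and the observable is a spin operator in the xy-plane with spectrum {+1, -1}. It checks that the Born rule assigns non-negative, additive probabilities that sum to one.

```python
import cmath
import math

def spin_operator(theta):
    """Spin matrix along an axis at angle theta in the xy-plane."""
    return [[0, cmath.exp(-1j * theta)],
            [cmath.exp(1j * theta), 0]]

def spectral_projection(theta, outcome):
    """Projection (1 + outcome * sigma) / 2 onto the eigenspace for outcome +1 or -1."""
    s = spin_operator(theta)
    return [[((1 if i == j else 0) + outcome * s[i][j]) / 2 for j in range(2)]
            for i in range(2)]

def born_probability(psi, projection):
    """<psi, P psi> for a unit vector psi and a projection P."""
    p_psi = [sum(projection[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * p_psi[i] for i in range(2)).real

def measure(psi, theta, delta):
    """Born probability that the outcome lies in delta, a subset of {+1, -1}."""
    return sum(born_probability(psi, spectral_projection(theta, outcome))
               for outcome in delta)

# Illustrative unit vector and axis (assumptions, not taken from the text).
psi = [math.cos(0.3), 1j * math.sin(0.3)]
theta = 1.1

p_plus = measure(psi, theta, {+1})
p_minus = measure(psi, theta, {-1})
total = measure(psi, theta, {+1, -1})
empty = measure(psi, theta, set())
```

Here `total` plays the role of the probability of the whole spectrum and `empty` that of the empty event, so together with additivity over the two outcomes the values behave like a probability measure on the finite spectrum.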

The Kochen-Specker Theorem shows that it is impossible to embed all the probability
spaces in Lemma 5.1 for all $A$ into one all-embracing classical probability space in a satisfactory
way. 96 At least, not without resorting to contextuality. Surprisingly enough, even though
the Bohrian interpretation emphasizes the incompatibility of different measuring contexts,
quantum mechanics itself is not contextual in a certain sense. As seen in Lemma 5.1, for a
fixed observable the probabilities assigned to different events do not depend on the measuring
context. More specifically, if $\mathcal{A}_1$, $\mathcal{A}_2$ and $\mathcal{A}_3$ are three observables, corresponding to the
operators $A_1$, $A_2$, $A_3$, such that $[A_1, A_3] = [A_2, A_3] = 0$ but $[A_1, A_2] \neq 0$, then (using the



95 The idea of this section was inspired by a small paragraph in [See2]. A more formal investigation of this
question was carried out in [BB]. My discussion on contextuality is based on some remarks made in [Mer3].

96 It was shown in Chapter 3 that this is possible if one restricts to a certain dense subset of all self-adjoint
operators. However, the present discussion focusses on quantum mechanics itself and from this point of view 
there is no motivation to deny the reversal of the observable postulate and the Kochen-Specker Theorem 
applies. 
notation of Section 2.3.4) one has



(199) 



This is a consequence of the fact that probabilities assigned to events in quantum mechanics 
are in a certain sense blind to the actual observable considered: that is, for any pair of 
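observables $\mathcal{A}_1, \mathcal{A}_2$ corresponding to the operators $A_1, A_2$ with $\Delta_1 \in \Sigma_{A_1}$ and $\Delta_2 \in \Sigma_{A_2}$ such
that $\mu_{A_1}(\Delta_1) = \mu_{A_2}(\Delta_2)$, for any state $\psi$ one has

$P_\psi[\mathcal{A}_1 \in \Delta_1] = P_\psi[\mathcal{A}_2 \in \Delta_2]$ (200)

irrespective of whether or not $A_1$ and $A_2$ commute. Indeed, in quantum mechanics propositions
of the form $\mathcal{A} \in \Delta$ fall into equivalence classes:

$(\mathcal{A}_1 \in \Delta_1) \sim (\mathcal{A}_2 \in \Delta_2) \iff \mu_{A_1}(\Delta_1) = \mu_{A_2}(\Delta_2)$. (201)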



Of course, this non-contextuality cannot be extended to the actual results of measurements
if $\mathcal{A}_1$ and $\mathcal{A}_2$ are incompatible. However, this cannot be considered a contextual aspect of the
theory, but is rather a consequence of the indeterministic character of the theory. 

Thus far, quantum mechanics seems to behave pretty decently as a stochastic hidden- 
variable theory. But certainly quantum mechanics violates the Bell inequality and it must 
therefore be non-local. To investigate this, quantum theory has to be translated further into
the language of stochastic hidden variables (Section 2.3.4). Although usually the pure states
are given by the one-dimensional projections, in this setting the pure states are given by the
density operators. Indeed, to each observable $\mathcal{A}$ a density operator $\rho$ assigns a probability
measure on the space $(\sigma(A), \Sigma_A)$ for each measuring context $C$ with $\mathcal{A} \in C$ in the following
way:

$\Sigma_A \ni \Delta \mapsto P_C[\mathcal{A} \in \Delta | \rho] = \mathrm{Tr}(\rho \mu_A(\Delta))$, (202)



which is indeed a probability measure (this follows from Lemma 5.1). The rule for conditionalizing
is given by the von Neumann postulate. In this case, equation (97) becomes

$P_C[\mathcal{A}_2 \in \Delta_2 | \mathcal{A}_1 \in \Delta_1, \rho] = P_C[\mathcal{A}_2 \in \Delta_2 | \rho']$, (203)

where 97

$\rho' = \frac{1}{\mathrm{Tr}(\rho \mu_{A_1}(\Delta_1))} \sum_{a \in \Delta_1} \mu_{A_1}(\{a\}) \rho \mu_{A_1}(\{a\})$. (204)

The macro states for this hidden-variable theory may all be considered to be of the form
$\mu = \delta_\rho$ in the sense of equation (101).



The notation introduced here allows one to check if quantum mechanics satisfies the locality 
conditions OILOC and CILOC. CILOC is clearly satisfied, since quantum mechanics is non- 
contextual, as seen above. Therefore, since quantum mechanics does violate the Bell inequality, 
it must violate OILOC. This is indeed the case. Consider again the state $\psi = \frac{1}{\sqrt{2}}(0, 1, -1, 0)$
in the Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2$ as in Example 2.2 and the measuring context $C_{12}$ from Lemma
2.15. One then has

$P_{C_{12}}[\sigma_r^{(2)} = 1 | P_\psi] = \mathrm{Tr}(P_\psi(\mathbb{1} \otimes P_{r+})) = \frac{1}{2}$ (205)



97 In case it worries the reader that $\mathrm{Tr}(\rho \mu_{A_1}(\Delta_1))$ may equal zero, note that in that case the probability
of finding the result $\mathcal{A}_1 \in \Delta_1$ given the state $\rho$ is zero, too. If one interprets events with probability zero as
events that cannot occur, the situation where $\mathrm{Tr}(\rho \mu_{A_1}(\Delta_1)) = 0$ will never occur. In other cases, one may find
it satisfactory to assign the value zero to $P_C[\mathcal{A}_2 \in \Delta_2 | \mathcal{A}_1 \in \Delta_1, \rho]$.
but

$P_{C_{12}}[\sigma_r^{(2)} = 1 | \sigma_r^{(1)} = 1, P_\psi] = \mathrm{Tr}(P_\psi'(\mathbb{1} \otimes P_{r+})) = 2\mathrm{Tr}(P_\psi(P_{r+} \otimes P_{r+})) = 0$. (206)

It would therefore seem that quantum mechanics is indeed non-local in this specific sense. But
that conclusion seems to be a bit rash. What does a violation of OILOC imply? There is in fact
much discussion on the issue whether or not a violation of OILOC implies non-locality, which
I will not discuss here. 98 Instead, I will only make a few remarks.
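
The dependence of Bob's probabilities on Alice's outcome can be made concrete with a short numerical check. The Python sketch below is an illustration only: the helper names (`projection`, `kron`, `apply`, `inner`) and the axis angle 0.7 are assumptions, the spin axis lies in the xy-plane, and conditionalizing is implemented for a pure state by projecting the state vector and renormalizing, which is how I read the von Neumann postulate here.

```python
import cmath
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def projection(theta, outcome):
    """Spectral projection of a spin operator in the xy-plane for outcome +1 or -1."""
    phase = cmath.exp(1j * theta)
    return [[0.5, outcome * 0.5 * phase.conjugate()],
            [outcome * 0.5 * phase, 0.5]]

def kron(a, b):
    """Tensor product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

# The state (0, 1, -1, 0) / sqrt(2) used in the text.
PSI = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

theta = 0.7  # arbitrary common axis for both particles (assumption)
p_plus = projection(theta, +1)

# Probability of outcome +1 for the second particle, without conditioning.
unconditional = inner(PSI, apply(kron(IDENTITY, p_plus), PSI)).real

# Condition on outcome +1 for the first particle: project and renormalize.
phi = apply(kron(p_plus, IDENTITY), PSI)
conditional = (inner(phi, apply(kron(IDENTITY, p_plus), phi)).real
               / inner(phi, phi).real)
```

The unconditional value comes out as one half while the conditional one vanishes, whatever common axis is chosen, which is the kind of outcome dependence at stake in the discussion of OILOC.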

From an information-theoretic point of view it would seem logical that new information 
alters the probabilities one should assign to events. However, according to the state postulate, 
the state $\psi$ should already contain all the information necessary to describe the system, since
it is supposed to be a complete description of the system. It can only be maintained that both the
states before the measurement and after the measurement provide a complete description of the 
system if the system at hand has been changed under the influence of the measurement. Since 
the description of both subsystems has been altered after the measurement, both subsystems 
must have been influenced by the measurement, which implies an action at a distance. It is 
in this sense that a violation of OILOC implies non-locality. 

This line of reasoning appears to be completely satisfactory, but there are a few loopholes.
In fact, the Copenhagen interpretation provides a way out. One may consider the wave 
function to be a complete description of the system and maintain that the second particle 
remains undisturbed under the influence of a measurement, provided one no longer considers 
the wave function as a property possessed by the system, but as a description of the system. 
Then, after the first measurement, what changes is not the system but our knowledge about
the system. 99 Indeed, after the first measurement the observer has gained new information
about the system, but this new information did not have any ontological value before the
measurement.

This may sound like a dispute of the reality of the system and an encouragement of the 
view that the measurement brings into being certain properties of the system. This is indeed 
a form of the Copenhagen interpretation often heard, but it is not a necessary conclusion. All 
that is maintained is that certain notions used to describe a system (i.e. the wave function, 
but also observables like spin, position and momentum) are not to be considered possessed 
properties of the system but rather aspects of the way the system is described. That is, in 
contrast with Einstein's view, these aspects are not supposed to correspond to elements of 
reality. In the words of Bohr: 

"The entire formalism [quantum mechanics] is to be considered as a tool for deriving 
predictions, of definite or statistical character, as regards information obtainable 
under experimental conditions described in classical terms and specified by means 
of parameters entering into the algebraic or differential equations of which the ma- 
trices or the wave-functions, respectively, are solutions. These symbols themselves 
[. . . ] are not susceptible to pictorial interpretation; and even derived real functions 
like densities and currents are only to be regarded as expressing the probabilities 
for the occurrence of individual events observable under well-defined experimental 
conditions." [Boh4, p. 314]



98 [Fin1], [Jar], [vF3, Ch. 4], [SET], [Mau1, Ch. 4] is a (too) short list of relevant publications.

99 I will later return to the question of what is actually meant by 'the knowledge about the system'.
There are some naturally appealing features to this viewpoint. What is being questioned
here is not so much the existence of reality but rather the direct correspondence between
reality and the way we humans perceive reality. This correspondence is assumed explicitly in
[EPR] by introducing a sufficient condition for the existence of an element of physical reality
(EOPR). However, such an assumption is of a metaphysical nature and naturally a physical 
theory may only describe the way we humans perceive reality. 

A few remarks are in order. First, the viewpoint sketched here may be termed 'the realist
interpretation of Bohr', which is also promoted by Folse in [Fol]. However, there are also
analyses of the philosophy of Bohr that portray Bohr as an anti-realist. In [Lan2] both these
viewpoints are considered. Second, from the viewpoint sketched above, the notion of completeness
in [EPR] no longer makes sense. Instead, there may be other notions of completeness
such that quantum mechanics may be considered complete. In fact, Bohr maintained through- 
out the development of quantum mechanics that it is a complete theory. However, there is no 
consensus about what he meant by that. At least, the impossibility proofs establish that it is 
a non-trivial matter to extend quantum mechanics to a theory that would be 'more complete' 
in a satisfactory way. If one accepts the impossibility of extending quantum theory, then the 
theory may be considered complete. In [EBF] the notion of completeness is discussed in more
detail. 

Finally, I'd like to make a remark that is perhaps somewhat controversial. In the experi- 
ment, after Alice has performed her experiment, her description of the entire system has been 
altered by the von Neumann postulate. However, Bob still uses the 'old' description. It seems 
that it is natural to assume that Bob's description is 'flawed'. However, this conclusion seems 
to rely on the assumption that Alice actually describes the state of the system. There are (at 
least) two ways out of this dilemma. The first is to state that, although Alice's description is 
not the state of the system, it is the objective description of the system. From this point of 
view, Bob's description is indeed flawed. The other way out is to assume that both Alice and 
Bob are right and there is no such thing as the objective description of a system, but only 
subjective descriptions. Indeed, from Bob's perspective, his state gives a complete description 
of the system, in the sense that it contains all the information available to him. 100 



5.3 Quantum Logic and the Violation of the Bell Inequality 

In the previous section it was established that, if one views quantum mechanics as a stochastic 
hidden-variable theory, the violation of the Bell inequality by quantum mechanics may be 
attributed to a violation of OILOC. If OILOC is considered as a notion of locality, it is clear 
what a violation of it means and that it is disturbing. But, as I argued, this viewpoint is 
not necessary. On the other hand, it is not clear what OILOC means precisely if it is not 
interpreted as a notion of locality and it is not at all clear how one should understand a 
violation of OILOC in that case. This problem is not easily solved, and I will not make an 
attempt here. What I will do is establish another proof of the Bell inequality based on logical 
considerations instead of philosophical ones. From this proof I will explain how I think Bohr

100 These two viewpoints are not necessarily mutually exclusive. In [Myr] a modification of the
quantum-mechanical state is introduced in such a way that its behavior is somewhat like that of the electromagnetic
field in special relativity; the notion of the state is objective, but the way it should be interpreted depends on 
the reference frame of the observer (as does its time-evolution). Roughly, the collapse of the state propagates 
with the speed of light. So directly after the first measurement, both Alice's and Bob's description of the 
system are correct descriptions. 
would consider it to be possible to violate the inequality, how orthodox quantum mechanics
violates this inequality, and how I think the violation should be interpreted. 

Before I continue, there is an important remark to be made. The Bell inequality derived 
in the following lemma is one of probabilistic logic and not one of probability theory. The 
first is a branch of logic, whereas the second is a branch of mathematics. The mathematical 
tools used in probability theory are therefore not used and hence are not necessarily assumed 
to be true. For example, it is not assumed that all probability functions can be associated 
with finite measures on some measurable space. Instead, in probabilistic logic a probability 
function $P$ is a rule that to each proposition $A$ assigns a value $P(A) \in [0, 1]$ that denotes the
probability that the proposition is true. There are many forms of probabilistic logic, each 
assuming different rules that should hold for the function P. I will assume the following rules: 

1) If $A \to B$, then $P(A) \leq P(B)$.
2) If $A$ and $B$ are mutually exclusive, i.e., $A \wedge B \to \bot$,
   then $P(A \vee B) = P(A) + P(B)$. (207)

It is further assumed that the propositions obey all the relations of classical logic. 



Lemma 5.2. For each probability function $P$ that satisfies (207), and for all propositions $A_1$,
$B_1$, $A_2$ and $B_2$ one has

$P(A_1 \wedge B_1) \leq P(A_1 \wedge B_2) + P(A_2 \wedge B_1) + P(\neg A_2 \wedge \neg B_2)$. (208)

Proof:

Using the rules assumed earlier, the inequality follows by simply expanding:

$P(A_1 \wedge B_1) = P(A_1 \wedge B_1 \wedge (B_2 \vee \neg B_2))$
$= P((A_1 \wedge B_1 \wedge B_2) \vee (A_1 \wedge B_1 \wedge \neg B_2))$
$= P(A_1 \wedge B_1 \wedge B_2) + P(A_1 \wedge B_1 \wedge \neg B_2) \leq P(A_1 \wedge B_2) + P(B_1 \wedge \neg B_2)$
$= P(A_1 \wedge B_2) + P(B_1 \wedge \neg B_2 \wedge (A_2 \vee \neg A_2))$ (209)
$= P(A_1 \wedge B_2) + P((B_1 \wedge \neg B_2 \wedge A_2) \vee (B_1 \wedge \neg B_2 \wedge \neg A_2))$
$= P(A_1 \wedge B_2) + P(B_1 \wedge \neg B_2 \wedge A_2) + P(B_1 \wedge \neg B_2 \wedge \neg A_2)$
$\leq P(A_1 \wedge B_2) + P(A_2 \wedge B_1) + P(\neg A_2 \wedge \neg B_2)$

□
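
As a sanity check on Lemma 5.2, the inequality (208) can be verified by brute force in the classical setting. The Python sketch below makes a stronger assumption than the lemma itself: it models the probability function as a weight distribution over the 16 truth assignments of the four propositions, i.e. as a classical probability space. All names (`WORLDS`, `probability`, `bell_sides`, `holds`) are hypothetical.

```python
from itertools import product
import random

# Every classical truth assignment of the propositions (A1, B1, A2, B2).
WORLDS = list(product([False, True], repeat=4))

def probability(weights, event):
    """Total weight of the truth assignments in which `event` holds."""
    return sum(w for world, w in zip(WORLDS, weights) if event(*world))

def bell_sides(weights):
    """Left- and right-hand side of inequality (208) for the given weights."""
    lhs = probability(weights, lambda a1, b1, a2, b2: a1 and b1)
    rhs = (probability(weights, lambda a1, b1, a2, b2: a1 and b2)
           + probability(weights, lambda a1, b1, a2, b2: a2 and b1)
           + probability(weights, lambda a1, b1, a2, b2: not a2 and not b2))
    return lhs, rhs

def holds(weights, tol=1e-12):
    lhs, rhs = bell_sides(weights)
    return lhs <= rhs + tol

# Point masses: each truth assignment taken with certainty.
point_masses = [[1.0 if i == k else 0.0 for i in range(16)] for k in range(16)]
pointwise_ok = all(holds(w) for w in point_masses)

# Random mixtures of truth assignments.
random.seed(0)

def random_weights():
    raw = [random.random() for _ in WORLDS]
    total = sum(raw)
    return [x / total for x in raw]

mixtures_ok = all(holds(random_weights()) for _ in range(1000))
```

Since both sides of (208) are linear in the weights, the check on the 16 point masses already covers every mixture; the random mixtures are included only as a redundancy.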



Typical propositions that play a role in physics are of the form $[\mathcal{A} \in \Delta]$, where $\mathcal{A}$ is an
observable and $\Delta$ some subset of $\mathbb{R}$ (preferably Borel). Here, $[\mathcal{A} \in \Delta]$ may be understood as
a shorthand notation of the statement: "The answer to the question 'Does the value of $\mathcal{A}$ lie
in $\Delta$?' is yes" (see also the discussion just after the proof of Lemma 2.12 in Section 2.3.3).

By considering such propositions, it is not hard to show that quantum mechanics violates
the inequality (208). It is again sufficient to look at the pair of spin-$\frac{1}{2}$ particles in the state
$\psi = \frac{1}{\sqrt{2}}(0, 1, -1, 0)$. Note that every spin operator along some axis $r$ in the xy-plane is of
the form

$\sigma_r = \begin{pmatrix} 0 & \cos(\theta_r) - i\sin(\theta_r) \\ \cos(\theta_r) + i\sin(\theta_r) & 0 \end{pmatrix}$. (210)
Now let $\sigma_a^{(1)}$ be the spin along the axis $a$ for one particle and $\sigma_b^{(2)}$ the spin along the axis $b$ for
the second particle. Using the notation of Example 2.2 one finds

$P_\psi[\sigma_a^{(1)} \in \{1\} \wedge \sigma_b^{(2)} \in \{1\}] = \langle\psi, P_{a+} \otimes P_{b+} \psi\rangle$
$= \frac{1}{2} \langle e_1 \otimes e_2 - e_2 \otimes e_1, (P_{a+} \otimes P_{b+})(e_1 \otimes e_2 - e_2 \otimes e_1)\rangle$
$= \frac{1}{2} (P_{a+,11} P_{b+,22} - P_{a+,12} P_{b+,21} + P_{a+,22} P_{b+,11} - P_{a+,21} P_{b+,12})$
$= \frac{1}{8} (1 - (\cos(\theta_a) - i\sin(\theta_a))(\cos(\theta_b) + i\sin(\theta_b)))$
$+ \frac{1}{8} (1 - (\cos(\theta_a) + i\sin(\theta_a))(\cos(\theta_b) - i\sin(\theta_b)))$
$= \frac{1}{4} (1 - \cos(\theta_a - \theta_b))$. (211)

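Similarly, one finds

$P_\psi[\sigma_a^{(1)} \in \{-1\} \wedge \sigma_b^{(2)} \in \{-1\}] = \frac{1}{4} (1 - \cos(\theta_a - \theta_b))$. (212)

Now take

$\theta_{a_1} = 0, \quad \theta_{a_2} = \frac{2\pi}{3}, \quad \theta_{b_1} = \pi, \quad \theta_{b_2} = \frac{\pi}{3}$, (213)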



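and set

$A_j = [\sigma_{a_j}^{(1)} \in \{1\}], \quad \neg A_j = [\sigma_{a_j}^{(1)} \in \{-1\}], \quad B_j = [\sigma_{b_j}^{(2)} \in \{1\}], \quad \neg B_j = [\sigma_{b_j}^{(2)} \in \{-1\}]$, (214)

for $j = 1, 2$. With these identifications and the expressions (211) and (212), the inequality
(208) would now read

$\frac{1}{2} = P(A_1 \wedge B_1) \leq P(A_1 \wedge B_2) + P(A_2 \wedge B_1) + P(\neg A_2 \wedge \neg B_2) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8}$, (215)

which is of course false.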
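
The quantum-mechanical values entering the inequality can be reproduced numerically. The Python sketch below is an illustration under assumptions: the helper names are hypothetical, and the angles 0, 2π/3, π and π/3 are my reading of the choice of axes, selected because they give the values 1/2 and 1/8 quoted in the text. Joint probabilities are computed directly as expectation values of tensor products of spectral projections in the state (0, 1, -1, 0)/√2.

```python
import cmath
import math

def projection(theta, outcome):
    """Spectral projection of a spin operator in the xy-plane for outcome +1 or -1."""
    phase = cmath.exp(1j * theta)
    return [[0.5, outcome * 0.5 * phase.conjugate()],
            [outcome * 0.5 * phase, 0.5]]

def kron(a, b):
    """Tensor product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(psi, m):
    """<psi, M psi> for a vector psi and a square matrix M."""
    m_psi = [sum(m[r][c] * psi[c] for c in range(len(psi))) for r in range(len(psi))]
    return sum(x.conjugate() * y for x, y in zip(psi, m_psi)).real

PSI = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def joint(theta_a, s, theta_b, t):
    """Probability of outcome s on particle 1 (axis a) and t on particle 2 (axis b)."""
    return expectation(PSI, kron(projection(theta_a, s), projection(theta_b, t)))

# Assumed axis angles (see the lead-in).
THETA_A1, THETA_A2 = 0.0, 2 * math.pi / 3
THETA_B1, THETA_B2 = math.pi, math.pi / 3

lhs = joint(THETA_A1, +1, THETA_B1, +1)
rhs = (joint(THETA_A1, +1, THETA_B2, +1)
       + joint(THETA_A2, +1, THETA_B1, +1)
       + joint(THETA_A2, -1, THETA_B2, -1))

# Closed form from the text, checked at one arbitrary pair of angles.
closed_form = 0.25 * (1 - math.cos(0.4 - 1.9))
numeric = joint(0.4, +1, 1.9, +1)
```

With these angles `lhs` exceeds `rhs`, so the inequality fails, and `numeric` agrees with the closed-form expression for the joint probability.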



The derivation of the Bell inequality (208) is not based on philosophical concepts like realism,
causality and locality, but on purely classically logical considerations. Of course, the
same inequality can also be derived assuming the structure of a classical probability space and
associating with each proposition some subset of this space. However, since such assumptions
were not made, the usual objections made against the derivation of the Bell inequality, or
explanations of its violation, can no longer be applied in a direct manner. The easiest objection
against the logical derivation is that it implicitly assumes non-contextuality. Indeed, it is
assumed that the proposition $A_1$ is the same in both the contexts $\{\sigma_{a_1}^{(1)}, \sigma_{b_1}^{(2)}\}$ and $\{\sigma_{a_1}^{(1)}, \sigma_{b_2}^{(2)}\}$,
which is denied in viable hidden-variable theories. But, as argued in Section 5.2, quantum mechanics
is in a certain sense a non-contextual theory, at least in such a way that the argument
of contextuality cannot be applied without making additional assumptions. Bohr's answer to
the paradox would probably be of a more philosophical nature. The derivation of the inequality
relies on introducing propositions which involve speaking about both observables $\sigma_{a_1}^{(1)}$ and $\sigma_{a_2}^{(1)}$,
or both $\sigma_{b_1}^{(2)}$ and $\sigma_{b_2}^{(2)}$. However, such observables are complementary. A proposition like
$A_1 \wedge B_1 \wedge B_2$ relies on an abuse of language and is therefore ambiguous. This makes the whole
derivation ambiguous and hence incorrect. However, although this Bohrian line of reasoning






may sound appealing, it is not based on quantum mechanics as it is. There is no mention
of the notion of complementarity in any of the axioms of this theory and also, there exists no
derivation of this notion from the axioms. In fact, orthodox quantum mechanics can violate
the inequality without resorting to any philosophical considerations. Each proposition in the
derivation of (208) can be associated with a mathematical object in the theory in a consistent
way. This association was first conceived in 1936 by Birkhoff and von Neumann [BvN]. In
this article it is shown that quantum mechanics can be viewed so as to obey a form of logic
noticeably different from the logic that is customarily held to be true (i.e. classical logic). This
new form of logic is usually termed quantum logic. I will give a short derivation of this logic
based on the one given in [Ish].

As noted earlier, typical propositions in physics are of the form $[\mathcal{A} \in \Delta]$. In quantum
mechanics, each such proposition can be associated with a projection operator, namely, $\mu_A(\Delta)$.
By (201), propositions fall into equivalence classes and there is a 1-1 correspondence between
the set of equivalence classes and the set of projection operators $\mathcal{P}(\mathcal{H})$. Following Isham in
[Ish], the following meaning is given to the statement that $P$ is true, for some $P \in \mathcal{P}(\mathcal{H})$:



Definition 5.1. For a quantum system in the state $\psi$ each proposition associated with the
projection $P$ is true if $\langle\psi, P\psi\rangle = 1$, i.e., a proposition $[\mathcal{A} \in \Delta]$ is true if the probability of
finding a value in $\Delta$ upon measurement of $\mathcal{A}$ is one.

Naturally, with each proposition one can associate the set of all states for which this 
proposition is true. For a proposition associated with the projection $P$ this set is precisely
given by $P\mathcal{H}$. Notice that $P\mathcal{H}$ is a closed linear subspace of $\mathcal{H}$ and that in fact all closed
linear subspaces are of this form. This identification can be used to introduce the logical 
connectives. 

• Disjunction: The statement $[\mathcal{A} \in \Delta$ or $\mathcal{A} \in \Delta']$ is associated with the set of all states
with the property that a measurement of $\mathcal{A}$ will yield an element of $\Delta$ or an element
of $\Delta'$ with probability one. This corresponding set of states will then be the closure of
the set $\mathrm{Sp}(\mu_A(\Delta)\mathcal{H}, \mu_A(\Delta')\mathcal{H})$, where Sp denotes the linear span. This closed linear
subspace is in fact equal to $\mu_A(\Delta \cup \Delta')\mathcal{H}$. Generalizing this result for different operators
yields the identification

$[\mathcal{A}_1 \in \Delta_1] \vee [\mathcal{A}_2 \in \Delta_2] = \overline{\mathrm{Sp}(\mu_{A_1}(\Delta_1)\mathcal{H}, \mu_{A_2}(\Delta_2)\mathcal{H})}$, (216)

where the overlining denotes the closure of the set.

• Conjunction: The statement $[\mathcal{A} \in \Delta$ and $\mathcal{A} \in \Delta']$ is associated with the set of all states
with the property that a measurement of $\mathcal{A}$ will yield a result that is both an element of
$\Delta$ and an element of $\Delta'$ with probability one. This corresponding set of states will then
be $\mu_A(\Delta)\mathcal{H} \cap \mu_A(\Delta')\mathcal{H}$. This is again a closed linear subspace given by $\mu_A(\Delta \cap \Delta')\mathcal{H}$.
Generalizing this result for different operators yields the identification

$[\mathcal{A}_1 \in \Delta_1] \wedge [\mathcal{A}_2 \in \Delta_2] = \mu_{A_1}(\Delta_1)\mathcal{H} \cap \mu_{A_2}(\Delta_2)\mathcal{H}$. (217)

This is the set of states with the property that a measurement of $\mathcal{A}_1$ will yield an
element of $\Delta_1$ with probability 1 and a measurement of $\mathcal{A}_2$ will yield an element of $\Delta_2$
with probability 1.

• Negation: The statement $[\mathcal{A} \notin \Delta]$ is associated with the set of all states with the
property that a measurement of $\mathcal{A}$ will yield a result that is not an element of $\Delta$ with
probability one. This corresponding set of states is given by $(\mu_A(\Delta)\mathcal{H})^\perp$. This is again
a closed linear subspace, given by $\mu_A(\Delta^c)\mathcal{H}$, where $\Delta^c$ denotes the complement in the
set $\sigma(A)$. So

$\neg[\mathcal{A} \in \Delta] = \mu_A(\Delta^c)\mathcal{H}$. (218)

With these identifications, one sees that every proposition can be associated with a unique
projection, irrespective of whether it concerns an elementary proposition 101 of the form $[\mathcal{A} \in \Delta]$,
or a proposition that is formed using several elementary propositions together with the
connectives. The propositional calculus then inherits the lattice structure of the set of projection
operators. The partial ordering is given by

$P_1 \leq P_2 \iff P_1\mathcal{H} \subset P_2\mathcal{H}$. (219)

The top element $\top$ is given by the unit operator $\mathbb{1}$. This is taken to correspond with 'absolute
truth'. Indeed, the corresponding set $\mathcal{H}$ contains all states, so this proposition is true for every state.
Similarly, the bottom element $\bot$ is given by the zero operator $\mathbb{O}$. Within this structure, the
following rules hold for all propositions $A_1$, $A_2$ and $A_3$:

$A_1 \vee (A_2 \vee A_3) = (A_1 \vee A_2) \vee A_3$    Associativity    $A_1 \wedge (A_2 \wedge A_3) = (A_1 \wedge A_2) \wedge A_3$
$A_1 \vee A_2 = A_2 \vee A_1$    Commutativity    $A_1 \wedge A_2 = A_2 \wedge A_1$
$A_1 \vee (A_1 \wedge A_2) = A_1$    Absorption    $A_1 \wedge (A_1 \vee A_2) = A_1$    (220)
$A_1 \vee \neg A_1 = \top$    Complements    $A_1 \wedge \neg A_1 = \bot$

However, the relations

$A_1 \vee (A_2 \wedge A_3) = (A_1 \vee A_2) \wedge (A_1 \vee A_3)$    Distributivity    $A_1 \wedge (A_2 \vee A_3) = (A_1 \wedge A_2) \vee (A_1 \wedge A_3)$    (221)

no longer hold in general. In particular, the second and the sixth step in the proof of Lemma
5.2 do not hold for the particular case of the two spin-$\frac{1}{2}$ particles considered.
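
The failure of distributivity can already be seen in a two-dimensional example. The Python sketch below is an illustration under assumptions, not the two-particle case discussed above: it works in the real plane, represents a proposition by an orthonormal basis of its subspace, and uses hypothetical helper names (`orthonormalize`, `join`, `complement`, `meet`). One may think of `a` as a statement about spin along one axis and of `b` and `c` as the two possible outcomes along a perpendicular axis.

```python
import math

TOL = 1e-9

def orthonormalize(vectors):
    """Gram-Schmidt: an orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(x * y for x, y in zip(b, w))
            w = [x - coeff * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > TOL:
            basis.append([x / norm for x in w])
    return basis

def join(u, v):
    """Disjunction: the linear span of the two subspaces."""
    return orthonormalize(u + v)

def complement(u, dim=2):
    """Negation: the orthogonal complement (u must be an orthonormal basis)."""
    standard = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    # After u, Gram-Schmidt only adds vectors orthogonal to u.
    return orthonormalize(u + standard)[len(u):]

def meet(u, v):
    """Conjunction: the intersection, via the complement of the span of complements."""
    return complement(join(complement(u), complement(v)))

def same_subspace(u, v):
    return len(u) == len(v) == len(join(u, v))

s = 1 / math.sqrt(2)
a = orthonormalize([[1.0, 0.0]])
b = orthonormalize([[s, s]])
c = orthonormalize([[s, -s]])

lhs = meet(a, join(b, c))           # a AND (b OR c)
rhs = join(meet(a, b), meet(a, c))  # (a AND b) OR (a AND c)
```

Here `lhs` is the one-dimensional subspace `a` itself, because `b` and `c` together span the whole plane, whereas `rhs` is the zero subspace, because `a` shares no non-zero vector with either `b` or `c`.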
There has been some discussion through the years about the question how significant 
the discovery of this so-called quantum logic is for the interpretation of quantum mechanics. 
Probably one of the most progressive views was advocated by Putnam in [Put]. In this article,
it is argued that most of the problems concerning quantum mechanics would cease to exist if 
one adopts the viewpoint that quantum logic is in fact the only 'true' logic. Classical logic 
then re-emerges as a certain limit similar to the way Euclidean geometry appears as a special 
case of non-Euclidean geometry. In particular, Putnam argued that one could easily adopt a 
(non-classical) realist point of view towards systems, observables and measurements. 

It seems to me that history wasn't very kind to this viewpoint. There were many more
people willing to criticize this viewpoint than willing to endorse it. 102 However, one of the
most interesting articles, due to Stairs [Sta], exposes some of the difficulties that in light of



101 In logic, one may probably use the term 'atomic formula'. 

102 This simply seems to be the fate of controversial ideas, as also seen earlier in the 'nullification' discussion
in Chapter 3. Consequently, it seems to me that such ideas have a tendency to be forgotten before they are
well understood. Indeed, Maudlin's "Tale of Quantum Logic" [Mau2] has a beginning and an end. It should
be noted that one of the factors that may have played a role is that Putnam later distanced himself from his 
own ideas. 
Putnam's thesis may be formulated as follows. Although quantum logic allows one to see
why the Bell inequality (208) may be violated, it does not help much in understanding why
the Bell inequality in Section 2.3.4 can be violated without engaging in the philosophical
discussion involving realism and locality. But even so, what quantum logic does is replace
the mystery of "Why can the Bell inequality be violated?" with the mystery of "Why can the
law of distributivity be violated?" This last mystery is extensively discussed in [Dum] and it
seems that there is no solution at hand. 

The quantum logical approach to the explanation of the violation of the Bell inequality 
(208) appears to be more precise than the Bohrian approach, since it exactly indicates which 
steps of the proof of Lemma 5.2 are flawed. But this approach also seems to immediately 
arrive at a philosophical dead end: the violation of distributivity. Indeed, from a philosophical 
point of view the Bohrian approach seems more appealing, even though it relies on the vague 
notion of complementarity. It does seem to be the case that the proof of Lemma 5.2 relies 
on ambiguous manipulations of the propositions. In fact, I believe that this discussion may 
provide us with a way to turn the situation around. Indeed, one should not search for an 
understanding of the notion of complementarity to explain why the proof of Lemma 5.2 is
flawed, but instead, one should search for an understanding of the flaw of the proof to reach
an explanation of the notion of complementarity.

Approaches similar to this have been carried out several times, of course. For example, 
in [Hee] an explanation of complementarity is explored from a quantum-logical point of view. 
However, this is of course again replacing a mystery by another mystery. In the same way, 
one also sometimes hears explanations based on non-commutativity, which plays a role in the 
mathematical structure of quantum mechanics. But this idea is of course circular. For Bohr, 
position and momentum are identified with non-commuting operators because the observables 
are of a complementary nature, not the other way around. In fact, one may argue that 
explanations based on mathematical structures are hardly explanations at all, ever. It also 
seems to me that the quantum-logical approach doesn't capture the ideas of Bohr at all. The 
first abuse of language in the proof of Lemma 5.2 already appears in the first step, where
the incompatible statements about the observables $\sigma_{b_1}^{(2)}$ and $\sigma_{b_2}^{(2)}$ are introduced by appealing
to the innocent looking law of excluded middle. However, this logical law has been accused 
earlier of being applied 'recklessly' by Brouwer (the founder of Intuitionism), and it's worth an 
investigation to find out whether or not something similar is going on in quantum mechanics. 



5.4 Intuitionism and Complementarity 

To see if intuitionism can play a possible role in the interpretation of quantum mechanics it 
is better to first have a small look at the philosophy behind intuitionism. Intuitionism is a 
philosophy of mathematics, which opposes the Platonic idea that mathematical objects exist 
independently of the mathematician. Consequently, for Brouwer, there is no independent
mathematical truth; a proposition only becomes true when one experiences its truth, i.e., when one
has found a proof for it. For Brouwer, this is also what a proof is: a construction that
one rehearses in one's head. It seems he never saw any reason to formalize this notion much
further, but fortunately, his student Heyting did, thereby introducing intuitionistic logic [Hey].
Independently, a few years earlier Kolmogorov also had come up with a formalization of some
of Brouwer's ideas [Kol1]. These works have led to what is now known as the Brouwer-Heyting-Kolmogorov
(BHK) interpretation of logical connectives, which reads as follows:

• A proof of $A \wedge B$ consists of a proof of $A$ and a proof of $B$.

• A proof of $A \vee B$ consists of a proof of $A$ or a proof of $B$ together with a rule that tells one
of which statement one has the proof, i.e., one can decide whether $A$ is true or $B$ is true.

• A proof of $A \to B$ consists of a rule that converts every proof of $A$ to a proof of $B$.

• A proof of $\neg A$ consists of a rule that converts every proof of $A$ to a proof of $\bot$, i.e., a
proof of $\neg A$ is a proof of $A \to \bot$.

In particular, a proof of $A \vee \neg A$ consists of showing that either $A$ is true, or showing that $A$
leads to a contradiction, and it is therefore not a triviality (as it is in the Platonic view). 
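
The BHK reading of the connectives can be illustrated with proof terms in a proof assistant. The following Lean 4 fragment is a sketch for illustration only; it uses nothing beyond the core language, and the particular examples are my own choice. Each proof is literally the kind of object the corresponding clause describes.

```lean
-- A proof of A ∧ B is a pair consisting of a proof of A and a proof of B.
example (A B : Prop) (a : A) (b : B) : A ∧ B := ⟨a, b⟩

-- A proof of A ∨ B is a proof of one disjunct, tagged with which one it is.
example (A B : Prop) (a : A) : A ∨ B := Or.inl a

-- A proof of A → B is a rule converting proofs of A into proofs of B.
example (A B : Prop) : A ∧ B → B ∧ A := fun h => ⟨h.2, h.1⟩

-- A proof of ¬A is a proof of A → False.
example (A : Prop) (h : A → False) : ¬A := h

-- The double negation of excluded middle is provable constructively,
-- even though A ∨ ¬A itself requires a classical axiom.
example (A : Prop) : ¬¬(A ∨ ¬A) :=
  fun h => h (Or.inr (fun a => h (Or.inl a)))
```

The last example shows that intuitionistic logic does not refute the law of excluded middle; it only declines to assert it without a decision between the two disjuncts.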

Brouwer himself was not that impressed by the introduction of intuitionistic logic. He 
viewed logic merely as the study of regularities that appear in the use of language [Bro].
But for him, performing mathematics is a language-free act. The only role language plays in 
mathematics is in communication, where one may only hope that expressing a proof in words 
leads to a similar construction in the mind of the person to whom you are explaining the proof. 
So for both Brouwer and Bohr, language may be seen as a necessary evil for communication. 
However, whereas for Bohr communication is seen as a necessary ingredient of science, for 
Brouwer communication wasn't that important. 



Of course, the BHK interpretation does not yet establish what an actual proof is, 103 but
only states how one should read the logical connectives. However, what counts as a proof is
irrelevant to the present discussion. What is interesting is to see if these interpretations of
connectives can be adapted to apply to physical statements, instead of mathematical ones. In
particular, I'll first take a closer look at the inequality of the previous section.



The abuse of language involved in Lemma 5.2 actually already appears in the inequality
(208) itself. The four statements $A_1 \wedge B_1$, $A_1 \wedge B_2$, $A_2 \wedge B_1$ and $\neg A_2 \wedge \neg B_2$ all refer to different
experimental situations, in each of which only two observables are being measured. At least,
this is the assumption one must make if one wishes to apply equations (211) and (212).
However, in practice, only one of these experimental situations can be actual; the others are
hypothetical. This means that inequality (208) expresses a counterfactual statement, which
may indeed be considered an abuse of language.
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In the proof of Lemma 5.2 counterfactual reasoning already enters in the first step. The 
reasoning is: 

"If instead of measuring and <jj^ one would measure , one would find either 
the result 1 or -1 i.e., one would be able to conclude B2 or -1B2." 

However, in the exact situation where one is measuring cri* and cr^ , one can neither draw the 

conclusion B2, nor the conclusion -1B2. Consequently, one cannot draw the conclusion B2 V-1B2 

either, that is (using the terminology of Brouwer), the statement B2 V — 1B2 is reckless, unless 

(2) 

one actually measures o"^ • So, the physical 'proof of a statement may be regarded to be the 
actual appearance of a measurement result. 

103 What actually counts as a proof and what does not is one of the general questions in the philosophy of
mathematics. A friendly book that looks at this question from a historical perspective is [Kra].

104 This may be compared to the counterfactual reasoning that appears in Example 2.2.

The obvious way out would seem to be to also measure $\sigma_{b_2}^{(2)}$. Then, if one also assumes
that $\sigma_{a_2}^{(1)}$ will be measured, every step in the proof of Lemma 5.2 is intuitionistically justified.
But surely the inequality would still be violated by quantum mechanics? However, it turns
out that this is no longer the case, since in this experimental setup, where on one side first
$\sigma_{a_1}^{(1)}$ is measured and then $\sigma_{a_2}^{(1)}$, and on the other side first $\sigma_{b_1}^{(2)}$ is measured and then $\sigma_{b_2}^{(2)}$,
the equations (211) and (212) will no longer hold in general. Although in the special case of (213)
one may still show that

\[ P(A_1 \wedge B_2) = P(A_2 \wedge B_1), \tag{222} \]



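one now finds that

\[
\begin{aligned}
P(\neg A_2 \wedge \neg B_2)
&= P(\neg A_2 \wedge \neg B_2 \wedge A_1 \wedge B_1) + P(\neg A_2 \wedge \neg B_2 \wedge \neg A_1 \wedge B_1) \\
&\quad + P(\neg A_2 \wedge \neg B_2 \wedge A_1 \wedge \neg B_1) + P(\neg A_2 \wedge \neg B_2 \wedge \neg A_1 \wedge \neg B_1) \\
&= \langle\psi, (P_{a_1+}P_{a_2-}P_{a_1+} \otimes P_{b_1+}P_{b_2-}P_{b_1+})\psi\rangle + \langle\psi, (P_{a_1-}P_{a_2-}P_{a_1-} \otimes P_{b_1+}P_{b_2-}P_{b_1+})\psi\rangle \\
&\quad + \langle\psi, (P_{a_1+}P_{a_2-}P_{a_1+} \otimes P_{b_1-}P_{b_2-}P_{b_1-})\psi\rangle + \langle\psi, (P_{a_1-}P_{a_2-}P_{a_1-} \otimes P_{b_1-}P_{b_2-}P_{b_1-})\psi\rangle
\end{aligned}
\]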

= A + + + l = A 

32 32 16 



(223) 



So in this case, the inequality (208) reads 



O — S T a T 



16' 



which is actually true. 



This example strengthens the belief that, in quantum mechanics, a statement of the form
$A \vee \neg A$ may only be considered to be true if an actual experiment is performed that will decide
whether $A$ or $\neg A$ is true, i.e., if one can find a 'physics proof' of $A \vee \neg A$. In this light, Peres'
statement that "unperformed experiments have no results"$^{105}$ may be explained by saying that
any statement $A$ is neither true nor false unless one has a proof of one or the other, i.e., unproven
statements are not true (nor are they not true). Now, two statements $A$ and $B$ may be called
complementary if one cannot simultaneously prove $A \vee \neg A$ and $B \vee \neg B$. That is, two statements
are complementary if they are not simultaneously decidable, ever. In quantum mechanics,
this is for example the case if $A$ is a statement about some observable $\mathcal{A}$, and $B$ a statement
about some observable $\mathcal{B}$, where the operators corresponding to $\mathcal{A}$ and $\mathcal{B}$ do not commute. This seems a
triviality, but notice that in this way complementarity is defined in logical terms. Its intuitive
use in quantum mechanics is merely a consequence of this definition. To make things clearer,
let me give a more intuitive example.

Consider the double slit experiment. The analysis given by Putnam in [Put] provides a
clear view from a logician's perspective. Consider a photon that went through the barrier and
let $R$ be the statement "the photon is detected within a certain region R on the photographic
plate". Further, let $S_1$ be the statement "the photon went through slit 1" and $S_2$ the statement
"the photon went through slit 2". Classically, one would expect that the probability of finding
the photon at a certain region R given that both slits are open is given by

\[ P(R \mid S_1 \vee S_2) = \tfrac{1}{2} P(R \mid S_1) + \tfrac{1}{2} P(R \mid S_2). \tag{224} \]



105 This is the title of Peres' article [Per1], in which he advocates against the use of counterfactual reasoning.
This opinion is also reflected in [Per2] and [Per4].

This expression is easily derived if one assumes that one may arrange that $P(S_1) = P(S_2)$ and
that photons cannot go through both slits at the same time. Indeed, one then finds

\[
\begin{aligned}
P(R \mid S_1 \vee S_2)
&= \frac{P(R \wedge (S_1 \vee S_2))}{P(S_1 \vee S_2)} = \frac{P((R \wedge S_1) \vee (R \wedge S_2))}{P(S_1 \vee S_2)} \\
&= \frac{P(R \wedge S_1)}{P(S_1 \vee S_2)} + \frac{P(R \wedge S_2)}{P(S_1 \vee S_2)} = \frac{P(R \wedge S_1)}{2P(S_1)} + \frac{P(R \wedge S_2)}{2P(S_2)} \\
&= \tfrac{1}{2} P(R \mid S_1) + \tfrac{1}{2} P(R \mid S_2).
\end{aligned}
\]
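The classical derivation above can be checked on a toy Kolmogorov model. The following is only a sketch: the joint distribution is invented for illustration, and the helper names `prob` and `cond` are hypothetical. It assumes $P(S_1) = P(S_2) = \tfrac{1}{2}$ and that the two slits are mutually exclusive, exactly as in the derivation.

```python
from fractions import Fraction

# Hypothetical classical model: each outcome is (slit, region) with a probability.
# The numbers are illustrative only; they satisfy P(S1) = P(S2) = 1/2.
joint = {
    (1, "R"): Fraction(1, 8),
    (1, "not R"): Fraction(3, 8),
    (2, "R"): Fraction(1, 4),
    (2, "not R"): Fraction(1, 4),
}

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum(p for outcome, p in joint.items() if event(outcome))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

in_R = lambda o: o[1] == "R"
s1 = lambda o: o[0] == 1
s2 = lambda o: o[0] == 2
s1_or_s2 = lambda o: s1(o) or s2(o)

lhs = cond(in_R, s1_or_s2)
rhs = Fraction(1, 2) * cond(in_R, s1) + Fraction(1, 2) * cond(in_R, s2)
print(lhs, rhs)
```

In any such classical model the two sides agree, which is why the observed interference pattern cannot be accommodated without giving up one of the steps of the derivation.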

However, it is well known that with both slits open an interference pattern emerges on the 
photographic plate, which is not what is obtained if one takes the sum of the patterns that 
emerge with one slit closed. Putnam tries to adopt a realist point of view and therefore states 
that each photon that goes through the barrier, goes through exactly one slit. He blames the 
use of the law of distributivity in the second step, which is not generally true in quantum 
logic. 

Another often-heard conclusion is that the particular photon behaves as a wave instead
of a particle in case both slits are open, and can therefore go through both slits at the same
time. Both views are rather mystifying. From the intuitionistic point of view the derivation
may in fact be considered to be correct. However, in this case the condition $S_1 \vee S_2$ means that
one can actually decide whether $S_1$ is the case, or $S_2$. Experiments have shown that in cases
where one actually can make this decision, the interference pattern dissolves and equation
(224) holds. Indeed, Putnam makes the starting assumption that in cases where a photon
actually goes through the barrier, the statement $S_1 \vee S_2$ is always true. Feynman actually got
it right from this point of view. About a similar experiment for electrons he concludes:

"What we must say (to avoid making wrong predictions) is the following. If one 
looks at the holes or, more accurately, if one has a piece of apparatus which is 
capable of determining whether the electrons go through hole 1 or hole 2, then one 
can say that it goes either through hole 1 or hole 2. But, when one does not try 
to tell which way the electron goes, when there is nothing in the experiment to 
disturb the electrons, then one may not say that an electron goes either through 
hole 1 or hole 2. If one does say that, and starts to make any deductions from 
the statement, he will make errors in the analysis. This is the logical tightrope on 
which we must walk if we wish to describe nature successfully." [FLSI p. 37-9]

Indeed, in experiments in which an interference pattern is found, nothing can be said about 
the path of the photons, whereas in experiments where the path of the photons is detected, 
the interference pattern dissolves. These two possible views on the behavior of photons are 
thus complementary. In connection with the logical definition of complementarity, one can say 
that statements about the specific slit through which a photon passes and statements about 
the wavelength of the photons (which can be derived from studying the interference pattern) 
are complementary; they are not simultaneously decidable. 

It is likely that the given logical definition of complementarity doesn't coincide with Bohr's
ideas about complementarity and I do not dare to claim that this definition serves as a magical
key to a better understanding of the ideas of Bohr. However, the given definition does
provide a way to take a new look at quantum mechanics based on considerations that do not
depend on quantum mechanics. In this light, quantum mechanics presents itself as a theory
in which complementary statements naturally and necessarily arise (e.g. statements about the
position and statements about the momentum of a particle are complementary). But quantum
mechanics is only a specific example in which complementarity arises and other examples
outside physics are quite likely to be possible, as also envisaged by Bohr. The present notion
of complementarity also shows what care should be taken in counterfactual reasoning. Indeed,
although one can envisage that in a certain situation $A$ may be decidable and in another $B$ may
be decidable, one should be careful to note that one cannot, in general, draw the conclusion
that both are decidable at the same time. That is, there is no problem with counterfactual
reasoning as such, but one should take care not to confuse counterfactual reasoning with factual
reasoning.



5.5 Towards Intuitionistic Quantum Logic 

In the previous sections some motivation was given to adopt an intuitionistic point of view 
towards quantum mechanics, and it was argued that this approach seems to be in line with 
parts of the Copenhagen interpretation. However, all these considerations only seem to float 
around quantum mechanics, but nowhere do they appear in the mathematical formulation 
of the theory. This is in strong contrast with, for example, the Bohmian interpretation of 
quantum mechanics or the MKC-models and the proposed axioms for these models discussed 
in Chapter 3. In fact, as it stands, the mathematical structure of quantum mechanics rather
advocates the adoption of quantum logic than of intuitionistic logic. It is not at all clear that
a consistent intuitionistic interpretation of quantum mechanics is possible. Unfortunately, I
cannot present a solution to this problem, but I will discuss some ideas that may lead
to consistent interpretations in the future. The reader more familiar with intuitionism
should be warned: the reasoning I use sometimes borrows ideas from intuitionism, but not
consistently so. The mathematics used is primarily classical, but the interpretation of it is not
always.

Formally, the task is to find a new correspondence between propositions and mathematical
objects in the theory such that these mathematical objects together form the structure of a
Heyting algebra, which is the formal algebraic structure of intuitionistic logic.$^{106}$ A natural
approach is to take a closer look at quantum logic. The first step taken by Birkhoff and von
Neumann, where they associate propositions of the form $[\mathcal{A} \in \Delta]$ with closed linear subspaces
(or the projections on these spaces), seems a very natural one to hang on to, so the focus is
on the derivation of the logical connectives. It seems plausible that their derivation contains
some conceptual flaws, since a non-classical logic is derived using considerations based on
classical logic. Indeed, the same approach when applied to classical physics leads to the
construction of a classical propositional lattice (in the form of a Boolean algebra).$^{107}$ The
derivation of quantum logic has of course been criticized several times over the years. An
interesting contribution is due to Popper, who concludes:

"It is of interest that the kind of change in classical logic which would fit what
Birkhoff and von Neumann suggest [...] would be the rejection of the law of
excluded middle [...], as proposed by Brouwer, but rejected by Birkhoff and von
Neumann." [Pop]

106 A Heyting algebra is a bounded lattice $L$ such that for all $l_1, l_2 \in L$ there is an element denoted $l_1 \to l_2$
that is the greatest element $l$ satisfying $l \wedge l_1 \leq l_2$. Negation (often named the pseudo-complement in
this context) may then be defined by $\neg l := l \to \bot$. A Heyting algebra is a proper generalization of the notion
of a Boolean algebra: every Boolean algebra is a Heyting algebra and a Heyting algebra is Boolean if and only
if $\neg\neg l = l$ for all $l \in L$.

107 See for example [Ish, Ch. 4].



This conclusion is based on the following consideration. It is not hard to find two statements
$A$ and $B$ (both of the form $[\mathcal{A} \in \Delta]$) such that in certain states one has

\[ P(B \wedge (A \vee \neg A)) = P(B) > 0 = P(B \wedge A) + P(B \wedge \neg A) = P((B \wedge A) \vee (B \wedge \neg A)). \tag{226} \]

The right-hand side of this inequality may be interpreted as stating that $B$ is incompatible
both with $A$ and $\neg A$. On the other hand, $P(B) > 0$ can be interpreted to imply that $B$ is not
an absurdity and therefore is a third possibility (besides the possibilities $A$ and $\neg A$).
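To make (226) concrete, here is a minimal sketch for a spin-1/2 system. The choice of statements is an assumption made for illustration, not taken from the text: $A$ is "spin up along z", $\neg A$ is "spin down along z", $B$ is "spin up along x", and the state is the eigenstate belonging to $B$. For simplicity only real vectors in $\mathbb{R}^2$ are used, and the helper names are hypothetical.

```python
from fractions import Fraction

ZERO = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]

def projector_onto(v):
    """Projection onto the line spanned by the nonzero integer vector v in R^2."""
    n = v[0] * v[0] + v[1] * v[1]
    return [[Fraction(v[i] * v[j], n) for j in range(2)] for i in range(2)]

def meet_of_lines(u, v):
    """Quantum-logical meet of two lines: the intersection of the subspaces."""
    parallel = (u[0] * v[1] - u[1] * v[0]) == 0
    return projector_onto(u) if parallel else ZERO

def born(P, psi):
    """Born probability <psi, P psi> / <psi, psi> for a real vector psi."""
    P_psi = [sum(P[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i] * P_psi[i] for i in range(2)) / sum(x * x for x in psi)

z_up, z_down, x_up = (1, 0), (0, 1), (1, 1)
psi = x_up  # unnormalized eigenstate of "spin up along x"

p_B = born(projector_onto(x_up), psi)
p_B_and_A = born(meet_of_lines(x_up, z_up), psi)
p_B_and_not_A = born(meet_of_lines(x_up, z_down), psi)
print(p_B, p_B_and_A, p_B_and_not_A)
```

Since the lines spanned by `z_up` and `z_down` together span the whole space, $A \vee \neg A$ is the trivial proposition in quantum logic, so $P(B \wedge (A \vee \neg A)) = P(B)$, while both meets with $B$ are the zero subspace.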

From this point of view, it does indeed seem strange that $A \vee \neg A$ is associated with triviality.
From an intuitionistic point of view, the peculiarity doesn't seem to arise from the definition
of negation. This deserves some explanation. In the BHK interpretation each mathematical
proposition is associated with a project, namely, that of finding a proof. In particular, Kolmogorov
associated propositions with problems that it is the task of the mathematician
to solve ([Kol2], see also [Coq]). Indeed, the negation of a proposition is not just simply
the absence of a proof, but actually requires a proof of its own. The notion of proof, however,
is quite an ambiguous one in physics. Indeed, in mathematics one often considers 'truth' to
mean 'provability', but in physics the best one can do is falsify a proposition. That is, for
any acceptable physical proposition one must be able to construct an experiment that may
have as an outcome that the proposition is not true. The statement $[\mathcal{A} \in \Delta]$ is falsified if a
measurement of $\mathcal{A}$ yields a result outside $\Delta$ and similarly, $\neg[\mathcal{A} \in \Delta]$ can be falsified if it is
associated with the statement $[\mathcal{A} \in \Delta^c]$ (note that this wouldn't be the case if $\neg[\mathcal{A} \in \Delta]$ were
associated with the set $\mathcal{H} \setminus \mu_A(\Delta)\mathcal{H}$).

On the other hand, in quantum logic the disjunction of two propositions seems to include
much more than the truth of either one of the individual propositions. Indeed, consider the
situation of equation (226) and consider a state for which $B$ is true. Quantum logic states that
for this state the proposition $A \vee \neg A$ is true. However, one cannot decide which of the two ($A$ and
$\neg A$) is true and even worse: assuming either option leads to a contradiction. Thus it seems more
natural to associate the set $\mu_A(\Delta)\mathcal{H} \cup \mu_A(\Delta^c)\mathcal{H}$ with the statement $[\mathcal{A} \in \Delta] \vee \neg[\mathcal{A} \in \Delta]$. This
leads to an extension of the set of propositions that not only includes closed linear subspaces
of the Hilbert space, but at least also incorporates finite unions of closed linear subspaces:

\[ L_1 := \left\{ \bigcup_{j=1}^{n} K_j \;;\; n \in \mathbb{N},\ K_j \subset \mathcal{H} \text{ is a closed linear subspace } \forall j \right\}. \tag{227} \]

This set forms a partially ordered set under inclusion, i.e.,

\[ \bigcup_{j=1}^{n} K_j \leq \bigcup_{j=1}^{m} K'_j \iff \bigcup_{j=1}^{n} K_j \subset \bigcup_{j=1}^{m} K'_j. \tag{228} \]

The top element $\top$ is given by $\mathcal{H}$ and the bottom element $\bot$ is the empty set $\emptyset$, taking an
empty union. The bottom element may be identified with the zero-dimensional subspace
$\{0\}$, since the zero vector is not a state. This leads to a small adjustment of the set $L_1$ by
introducing the equivalence relation

\[ l_1 \sim l_2 \iff (l_1 \setminus l_2) \cup (l_2 \setminus l_1) \subset \{0\}, \qquad l_1, l_2 \in L_1, \tag{229} \]

and then taking $L_1/\sim$. I will treat $L_1$ as if it is in fact $L_1/\sim$. A disjunction and a conjunction
can be defined as follows:

\[
\begin{aligned}
\bigcup_{j=1}^{n} K_j \vee \bigcup_{k=1}^{m} K'_k &:= \bigcup_{j=1}^{n} \bigcup_{k=1}^{m} (K_j \cup K'_k) \in L_1, \\
\bigcup_{j=1}^{n} K_j \wedge \bigcup_{k=1}^{m} K'_k &:= \bigcup_{j=1}^{n} \bigcup_{k=1}^{m} (K_j \cap K'_k) \in L_1.
\end{aligned}
\tag{230}
\]

The definition of the disjunction is a straightforward generalization of the ideas that led to
the definition of $L_1$. The definition of the conjunction is inspired by its use in quantum logic.
It is easy to see that these definitions in fact coincide with the join and the meet that turn $L_1$
into a lattice.

In $L_1$, the definition of negation generalizes to

\[ \neg \bigcup_{j=1}^{n} K_j := \left\{ \psi \in \mathcal{H} \;;\; \langle\psi, \phi\rangle = 0,\ \forall \phi \in \bigcup_{j=1}^{n} K_j \right\}, \tag{231} \]

which is a closed linear subspace and thus again an element of $L_1$. This negation has some
interesting properties. For example, for any element $l \in L_1$, its double negation $\neg\neg l$ is the
smallest closed linear subspace that contains $l$. Consequently, the closed linear subspaces are
precisely the regular elements of $L_1$, i.e., the elements for which the relation $\neg\neg l = l$ holds.
The negation also behaves non-classically, since one has

\[ l \vee \neg l \neq \top \quad \text{and} \quad \neg l \vee \neg\neg l \neq \top, \qquad \forall l \in L_1 \setminus \{\top, \bot\}. \tag{232} \]

However, this negation is not a pseudo-complement (which is required for $L_1$ to be a Heyting
algebra), since although one does have $l \wedge \neg l = \bot$ for all $l \in L_1$, $\neg l$ is not the greatest element
that satisfies this property. In fact, there is no greatest element in $L_1$ that satisfies this
criterion for any $l \in L_1 \setminus \{\top, \bot\}$, and a pseudo-complement can therefore not be defined at all.
To see this, consider the two-dimensional Hilbert space with orthonormal basis $e_1, e_2$. Then
any element $l \in L_1$ that satisfies $P_{e_1}\mathcal{H} \not\subset l$ satisfies $P_{e_1}\mathcal{H} \wedge l = \bot$, but the supremum over all
these elements does not exist (unless one also allows arbitrary unions in $L_1$, in which case the
pseudo-complement simply becomes the set-theoretic complement). From the impossibility to
define a pseudo-complement it does not only follow that $L_1$ cannot be given the structure of
a Heyting algebra, but also that it cannot conform to a broader class of algebras (such as the
pseudo-complemented lattices).

In conclusion, one cannot embed the quantum-logical structure in a Heyting algebra simply
by redefining conjunction. But redefining negation is also not an option. Indeed, it is easy to
show that one cannot define an implication on the lattice of projections with disjunction and
conjunction defined as in Section 5.3. To see this, consider again the two-dimensional Hilbert
space and let $f := \tfrac{1}{2}\sqrt{2}(e_1 + e_2)$. The proposition $P_f \to P_{e_1}$ should be the supremum of all
elements $P$ that satisfy $P \wedge P_f \leq P_{e_1}$. Thus $P_f \to P_{e_1}$ should be the supremum over all $P$ that
satisfy $P_f \not\leq P$, which does not exist. These considerations exclude every acceptable Heyting
algebra in terms of subsets of the Hilbert space.

partial order. Both options are not likely to result in any structure that also allows an interpretation.

A way to break out of the structure of subsets of the Hilbert space may be found when
looking once more at the particular proposition $[\mathcal{A} \in \Delta] \vee \neg[\mathcal{A} \in \Delta]$, where $\neg[\mathcal{A} \in \Delta]$ is
interpreted as $[\mathcal{A} \in \Delta^c]$. According to quantum mechanics, a measurement of $\mathcal{A}$ will with
certainty yield a result in the set $\Delta \cup \Delta^c$. Does this also mean that one can say that the result
lies either in $\Delta$ or in $\Delta^c$? From an intuitionistic point of view the answer is not clear, since
for an intuitionist the equality $\Delta \cup \Delta^c = \mathbb{R}$ does not hold in general. There may be numbers
$x \in \mathbb{R}$ for which one cannot make this decision. However, in practice one may imagine that
the set $\Delta$ is well-behaved and the set of all numbers $x \in \mathbb{R}$ for which one cannot decide $x \in \Delta$
or $x \in \Delta^c$ is most likely to be assigned probability zero in any quantum-mechanical state.
What I want to get at is that in the event of an actual measurement of $\mathcal{A}$, it is natural to
assume that $[\mathcal{A} \in \Delta] \vee \neg[\mathcal{A} \in \Delta]$ is true. However, in the case of a measurement of another
observable, the proposition $[\mathcal{A} \in \Delta] \vee \neg[\mathcal{A} \in \Delta]$ is not likely to make any sense. What may be
considered true and what may not, may depend on the measurement context. The information
about measuring contexts is thrown away when propositions are identified with closed linear
subspaces, and this may be seen as an explanation of why it is not possible to unambiguously
associate propositions with subsets of the Hilbert space from an intuitionistic point of view.

It was shown in [CHLS] that if one adopts a richer structure that also accounts for measuring
contexts, it is possible to construct a Heyting algebra to obtain an intuitionistic quantum
logic for systems associated with a finite-dimensional Hilbert space. Their results are obtained
by looking at quantum mechanics from a topos-theoretic point of view. This may be seen as a
difficult way of taking the easy way out. The easy part is that Heyting algebras arise naturally
in topos theory. The difficult part is that the approach is very abstract from a mathematical
point of view. It is not easy to find physical interpretations for the results obtained in this
way, which is also my main concern with the Heyting algebra obtained in [CHLS]. To discuss
these concerns, I'll first introduce the Heyting algebra under consideration.

Each finite-dimensional Hilbert space over $\mathbb{C}$ is isomorphic with $\mathbb{C}^n$. The set of operators
then corresponds with the set of all $n \times n$ matrices $\mathfrak{A} = M_n(\mathbb{C})$. Measuring contexts are
associated with Abelian unital sub-C*-algebras of $\mathfrak{A}$. The motivation for this is based on
the FUNC rule of Section 2.3.3. For any observable $\mathcal{A}$ and any Borel function $f$, one can
introduce the observable $f(\mathcal{A})$ by applying $f$ to the result of a measurement of $\mathcal{A}$. So in any
measurement context in which one can measure $\mathcal{A}$, one can also measure all observables of
the form $f(\mathcal{A})$. Then, if $\mathcal{A}$ is associated with the operator $A$, one may associate $f(\mathcal{A})$ with
$f(A)$. So the measurement context in which one can measure $\mathcal{A}$ may be associated with the
set $\{f(A) \;;\; f \text{ Borel}\}$. These are precisely all the self-adjoint operators in the smallest unital
sub-C*-algebra that contains $A$. This algebra is automatically Abelian. Furthermore, all
Abelian unital sub-C*-algebras of $\mathfrak{A}$ are of this form. The set of all these sub-algebras will be
denoted by $\mathcal{C}(\mathfrak{A})$.
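As a small illustration of why the set $\{f(A)\}$ is Abelian, the sketch below takes $A = \sigma_z$ and uses its spectral decomposition $\sigma_z = (+1)P_{z+} + (-1)P_{z-}$, so that $f(\sigma_z) = f(1)P_{z+} + f(-1)P_{z-}$ is always diagonal. The particular functions and the helper names `f_of_sigma_z` and `matmul` are illustrative assumptions.

```python
def f_of_sigma_z(f):
    """Matrix of f(sigma_z), using sigma_z = (+1) P_plus + (-1) P_minus."""
    return [[f(1), 0], [0, f(-1)]]

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

identity = f_of_sigma_z(lambda t: t * t)          # t^2 sends both eigenvalues to 1
p_z_plus = f_of_sigma_z(lambda t: (1 + t) // 2)   # indicator of the eigenvalue +1
shifted = f_of_sigma_z(lambda t: 3 * t + 2)       # an arbitrary further function

commute = matmul(p_z_plus, shifted) == matmul(shifted, p_z_plus)
print(identity, p_z_plus, shifted, commute)
```

Both the unit and the spectral projection $P_{z+}$ arise as functions of $\sigma_z$, and any two such functions commute because they are diagonal in the same basis.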

As argued earlier, it is a problem in quantum logic that propositions are specified without
referring to a measurement context. For example, the statement $[\mathcal{A} \in \Delta] \vee \neg[\mathcal{A} \in \Delta]$ may be
considered to be true in any context in which one can measure $\mathcal{A}$ and not true in other contexts.
The question is, of course, how these notions should be linked together. One suggestion is to
associate propositions with functions

\[ S : \mathcal{C}(\mathfrak{A}) \to \mathcal{P}(\mathfrak{A}), \tag{233} \]

where $\mathcal{P}(\mathfrak{A})$ is the set of all projection operators in $\mathfrak{A}$. The idea is that for each measuring
context a function $S$ specifies what is actually being measured in that context. In [CHLS],
the following restrictions are derived for these functions:

\[ L_2 := \{ S : \mathcal{C}(\mathfrak{A}) \to \mathcal{P}(\mathfrak{A}) \;;\; S(\mathcal{C}) \in \mathcal{P}(\mathcal{C}),\ S(\mathcal{C}) \leq S(\mathcal{D}) \text{ if } \mathcal{C} \subset \mathcal{D},\ \forall \mathcal{C}, \mathcal{D} \in \mathcal{C}(\mathfrak{A}) \}, \tag{234} \]

where the partial order refers to the one from quantum logic. This set becomes a bounded
lattice under the definition

\[ S_1 \leq S_2 \iff S_1(\mathcal{C}) \leq S_2(\mathcal{C}), \quad \forall \mathcal{C} \in \mathcal{C}(\mathfrak{A}), \tag{235a} \]

such that its top and bottom element are given by

\[ \top(\mathcal{C}) = 1, \quad \bot(\mathcal{C}) = 0, \quad \forall \mathcal{C} \in \mathcal{C}(\mathfrak{A}), \tag{235b} \]

and the join and meet are given by

\[ (S_1 \vee S_2)(\mathcal{C}) = S_1(\mathcal{C}) \vee S_2(\mathcal{C}), \quad \forall \mathcal{C} \in \mathcal{C}(\mathfrak{A}); \tag{235c} \]

\[ (S_1 \wedge S_2)(\mathcal{C}) = S_1(\mathcal{C}) \wedge S_2(\mathcal{C}), \quad \forall \mathcal{C} \in \mathcal{C}(\mathfrak{A}). \tag{235d} \]

Note that the lattice operations on the right-hand side are those from quantum logic. As it
turns out, with the definition of implication given by

\[ (S_1 \to S_2)(\mathcal{C}) := \bigvee \{ P \in \mathcal{P}(\mathcal{C}) \;;\; P \leq S_1(\mathcal{D})^\perp \vee S_2(\mathcal{D}),\ \forall \mathcal{D} \supset \mathcal{C} \}, \tag{235e} \]

the lattice becomes a Heyting algebra, which in general is non-Boolean (i.e., if one defines
$\neg S := S \to \bot$, there are elements $S$ for which $S \vee \neg S \neq \top$). From a mathematical point of view
this is enough to speak of an intuitionistic logic of $M_n(\mathbb{C})$. However, without an interpretation
this logic is meaningless from a physical point of view. In [CHLS], any indication for a
possible interpretation is missing. The question therefore becomes: "What does a proposition
$S$ actually state?" And: "What is the significance of the restrictions in (234)?" The idea
behind the first restriction may be that statements in a certain measuring context should
make sense in that specific measuring context. The idea behind the second condition may be
that in a larger measuring context more may be true. But these notions are, of course, a bit
vague and so it may be better to investigate them in an explicit example.
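Before turning to that example, the definitions (234) and (235a)-(235e) can be tried out by brute force on a finite toy model. The sketch below is an illustration under strong assumptions: instead of all contexts of $M_2(\mathbb{C})$ it keeps only three, the trivial context `C1` and two maximal contexts `Cz` and `Cx`, and it encodes the projections of each maximal context as 2-bit masks over that context's own pair of eigenprojections (bit value 1 for the "+" projection, 2 for the "-" projection, 0 and 3 for the zero and unit operators). All names are hypothetical.

```python
from itertools import product

FULL = 0b11  # the unit operator; 0 is the zero operator
# For each context, the contexts that contain it (toy poset, an assumption).
SUPERCONTEXTS = {"C1": ["C1", "Cz", "Cx"], "Cz": ["Cz"], "Cx": ["Cx"]}
CONTEXTS = list(SUPERCONTEXTS)
ALLOWED = {"C1": [0, FULL], "Cz": [0, 1, 2, 3], "Cx": [0, 1, 2, 3]}

def leq(p, q):
    return (p & q) == p

def in_L2(s):
    # second restriction of (234): S(C) <= S(D) whenever C is contained in D
    return all(leq(s[c], s[d]) for c in CONTEXTS for d in SUPERCONTEXTS[c])

candidates = (dict(zip(CONTEXTS, values))
              for values in product(*(ALLOWED[c] for c in CONTEXTS)))
L2 = [s for s in candidates if in_L2(s)]

TOP = {c: FULL for c in CONTEXTS}
BOTTOM = {c: 0 for c in CONTEXTS}

def below(a, b):
    return all(leq(a[c], b[c]) for c in CONTEXTS)

def meet(a, b):
    return {c: a[c] & b[c] for c in CONTEXTS}

def join(a, b):
    return {c: a[c] | b[c] for c in CONTEXTS}

def implies(a, b):
    # (235e): join of all P in P(C) with P <= a(D)^perp v b(D) for every D containing C
    out = {}
    for c in CONTEXTS:
        value = 0
        for p in ALLOWED[c]:
            if all(leq(p, (FULL ^ a[d]) | b[d]) for d in SUPERCONTEXTS[c]):
                value |= p
        out[c] = value
    return out

def neg(a):
    return implies(a, BOTTOM)

# Defining property of a Heyting implication: c ^ a <= b  iff  c <= (a -> b).
is_heyting = all(below(meet(c, a), b) == below(c, implies(a, b))
                 for a in L2 for b in L2 for c in L2)
witness = {"C1": 0, "Cz": 1, "Cx": 0}
excluded_middle_fails = join(witness, neg(witness)) != TOP
forced_zero = all(s["C1"] == 0 for s in L2 if s["Cz"] == 1)
print(len(L2), is_heyting, excluded_middle_fails, forced_zero)
```

In this toy model the implication (235e) satisfies the defining property of a Heyting algebra, the element `witness` violates the law of excluded middle, and every element that assigns the "+" projection to `Cz` is forced by the second restriction to assign the zero operator to the trivial context.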

Consider the system of a single spin-$\tfrac{1}{2}$ particle with associated Hilbert space $\mathbb{C}^2$. More
specifically, consider a measurement of the spin along the z-axis. The associated operator and
corresponding minimal measuring context are given by

\[ \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \mathcal{C}_z = \left\{ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \;;\; a, b \in \mathbb{C} \right\}. \tag{236} \]

The algebra $\mathcal{C}_z$ is the only measuring context in which $\sigma_z$ can be measured and the only measuring
context properly contained in this one is $\mathbb{C}1$. Now consider the proposition $[P_{z+} = 1]$,
which is generally read as "the spin along the z-axis is up" and which in quantum logic is associated
with the projection operator $P_{z+}$. It is a fundamental proposition, since it may be the
outcome of a measurement; any physical theory should at least incorporate such propositions.
So it becomes the task to associate an element $S_{[P_{z+}=1]}$ of $L_2$ with this proposition. For its
value in $\mathcal{C}_z$ two possibilities come to mind:

\[ S_{[P_{z+}=1]}(\mathcal{C}_z) = P_{z+} \quad \text{or} \quad S_{[P_{z+}=1]}(\mathcal{C}_z) = 1. \tag{237} \]

The second option immediately leads to problems because it would imply $S_{[P_{z-}=1]}(\mathcal{C}_z) = 1$,
which consequently results in $S_{[P_{z+}=1]} = S_{[P_{z-}=1]}$. The first option implies $S_{[P_{z+}=1]}(\mathbb{C}1) = 0$
because $S_{[P_{z+}=1]}$ must be an element of $L_2$. Thus the proposition $S_{[P_{z+}=1]}$ seems to state
that in the measurement context in which one can only measure trivialities ($\mathbb{C}1$), one finds
a contradiction (which is the standard interpretation of the zero operator). This seems very
puzzling to me to say the least, and I think the problem arises from the restriction $S(\mathcal{C}) \leq S(\mathcal{D})$
whenever $\mathcal{C} \subset \mathcal{D}$.

Consider the following interpretation of a function $S : \mathcal{C}(\mathfrak{A}) \to \mathcal{P}(\mathfrak{A})$: given the proposition
$S$, then $S(\mathcal{C})$ gives all the available information that is relevant for predictions concerning
measurements in the measuring context $\mathcal{C}$. Now consider the given information $[P_{z+} = 1]$. An
associated proposition $S_{[P_{z+}=1]}$ should state that in the measuring context $\mathcal{C}_z$ one will only
find results compatible with $[P_{z+} = 1]$, i.e., $S_{[P_{z+}=1]}(\mathcal{C}_z)$ should be equal to $P_{z+}$. However,
in the measuring context $\mathbb{C}1$, the information $[P_{z+} = 1]$ is completely useless; it is as good
as no information at all. Without information one can only hang on to trivialities, which
may best be associated with the projection 1. Opposite to trivialities, a contradiction may be
interpreted as an excess of information, associated with the projection 0. Indeed, the most
natural contradiction arises from $A \wedge \neg A$ for any proposition $A$. Note that this line of reasoning
may be applied to any measuring context that doesn't contain $\sigma_z$. Therefore, it seems natural
to define

\[ S_{[P_{z+}=1]}(\mathcal{C}) := \begin{cases} P_{z+}, & \sigma_z \in \mathcal{C}, \\ 1, & \text{otherwise}, \end{cases} \qquad \forall \mathcal{C} \in \mathcal{C}(\mathfrak{A}), \tag{238} \]

which is no longer an element of $L_2$, since it violates the second condition.
From this point of view, it seems natural to introduce the set

\[ L_3 := \{ S : \mathcal{C}(\mathfrak{A}) \to \mathcal{P}(\mathfrak{A}) \;;\; S(\mathcal{C}) \in \mathcal{P}(\mathcal{C}),\ S(\mathcal{C}) \leq S(\mathcal{D}) \text{ if } \mathcal{C} \supset \mathcal{D},\ \forall \mathcal{C}, \mathcal{D} \in \mathcal{C}(\mathfrak{A}) \}. \tag{239} \]

The new restriction, i.e. $S(\mathcal{C}) \leq S(\mathcal{D})$ whenever $\mathcal{D} \subset \mathcal{C}$, may be motivated by the interpretation
that the available information $S$ can be further specified in a broader measuring context.
The quantum propositional lattice $\mathcal{P}(\mathcal{H})$ may be embedded into this set by taking

\[ S_P(\mathcal{C}) := \begin{cases} P, & P \in \mathcal{C}, \\ 1, & P \notin \mathcal{C}, \end{cases} \qquad \forall \mathcal{C} \in \mathcal{C}(\mathfrak{A}),\ \forall P \in \mathcal{P}(\mathcal{H}). \tag{240} \]



This embedding may also be used to motivate the operations (235a), (235b), (235c) and
(235d) to turn $L_3$ into a lattice.$^{109}$ Information $S_1$ may be considered to be more precise
than $S_2$ ($S_1 \leq S_2$) if and only if the information is more precise in every measuring context
($S_1(\mathcal{C}) \leq S_2(\mathcal{C})$ $\forall \mathcal{C}$). The top element corresponds with no information at all ($\top = S_1$) and
the bottom element corresponds with contradictory information ($\bot = S_0$). Now consider
the information $S_{P_1}$ or $S_{P_2}$. In a measurement context containing both $P_1$ and $P_2$, one can
then conclude that $P_1 \vee P_2$ in this context. However, in any other context, one can draw no
conclusion at all, since one doesn't know which information $P_1$ or $P_2$ is true.$^{110}$ One then
finds $(S_{P_1} \vee S_{P_2})(\mathcal{C}) = S_{P_1}(\mathcal{C}) \vee S_{P_2}(\mathcal{C})$ for all $\mathcal{C}$. This may be compared to the proposition
$S_{P_1 \vee P_2}$, which contains more information from this perspective, since $S_{P_1 \vee P_2} \leq S_{P_1} \vee S_{P_2}$.
Finally, consider the information $S_{P_1}$ and $S_{P_2}$. In a measurement context containing both
$P_1$ and $P_2$, both pieces of information can be applied to obtain $P_1 \wedge P_2$. In a measurement
context containing only one of the $P_1$, $P_2$, only one part of information is applicable and one
may only obtain the relevant $P_1$ or $P_2$. One therefore finds $(S_{P_1} \wedge S_{P_2})(\mathcal{C}) = S_{P_1}(\mathcal{C}) \wedge S_{P_2}(\mathcal{C})$
for all $\mathcal{C}$.

109 It is easy to see that $L_3$ with these operations is in fact a bounded lattice, and I will omit the proof here.

110 This is a peculiarity even in the BHK interpretation. To prove $A \vee B$ one must prove at least one of the
statements. Thus having obtained a proof of $A \vee B$, one has obtained a proof of $A$ or a proof of $B$. However,
once the conclusion is converted from either $A$ is true to $A \vee B$ is true, or from $B$ is true to $A \vee B$ is true, one
cannot recover which of the two is true from the proposition $A \vee B$. This is only recovered by looking at the
proof of $A \vee B$, in which case one would rather have one of the propositions $A$, $B$ or $A \wedge B$. The same reasoning
may be applied when thinking of $S_{P_1} \vee S_{P_2}$.

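The lattice $L_3$ is turned into a Heyting algebra by taking

\[ (S_1 \to S_2)(\mathcal{C}) := \bigwedge \{ S_1(\mathcal{D})^\perp \vee S_2(\mathcal{D}) \;;\; \mathcal{D} \subset \mathcal{C} \}. \tag{241} \]

Indeed, for each $\mathcal{C}$, the maximal element of $\mathcal{P}(\mathcal{C})$ that can be assigned to $S_1 \to S_2$ such that
$(S_1 \to S_2)(\mathcal{C}) \wedge S_1(\mathcal{C}) \leq S_2(\mathcal{C})$ is $S_1(\mathcal{C})^\perp \vee S_2(\mathcal{C})$. However, since $S_1 \to S_2$ also has to satisfy
$(S_1 \to S_2)(\mathcal{C}) \leq (S_1 \to S_2)(\mathcal{D})$ whenever $\mathcal{D} \subset \mathcal{C}$, the maximal element that can be assigned
to $(S_1 \to S_2)(\mathcal{C})$ such that $S_1 \to S_2 \in L_3$ is precisely the one given in (241). This also
shows that $L_3$ is indeed a Heyting algebra. The negation is then defined in the usual way by
$\neg S := S \to \bot$. This negation has some interesting features. Suppose $S(\mathbb{C}1) = 1$. Then, by
definition,

\[ \neg S(\mathbb{C}1) = (S \to \bot)(\mathbb{C}1) = S(\mathbb{C}1)^\perp \vee \bot(\mathbb{C}1) = 0. \tag{242} \]

This implies that $\neg S = \bot$. On the other hand, if $S(\mathbb{C}1) = 0$, then $S = \bot$ and one has

\[
\begin{aligned}
\neg S(\mathcal{C}) = (S \to \bot)(\mathcal{C}) &= \bigwedge \{ S(\mathcal{D})^\perp \vee \bot(\mathcal{D}) \;;\; \mathcal{D} \subset \mathcal{C} \} \\
&= \bigwedge \{ 0^\perp \vee 0 \;;\; \mathcal{D} \subset \mathcal{C} \} = 1.
\end{aligned}
\tag{243}
\]

In conclusion

\[ \neg S = \begin{cases} \top, & S = \bot, \\ \bot, & \text{otherwise}, \end{cases} \qquad \forall S \in L_3. \tag{244} \]

So the only regular elements of $L_3$ are $\bot$ and $\top$, which makes $L_3$ extremely non-Boolean.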
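The claims about (241) and (244) can be checked by brute force on a finite toy model. The sketch below is an illustration under strong assumptions: it keeps only three contexts (the trivial context `C1` and two maximal contexts `Cz` and `Cx`) rather than all contexts of $M_2(\mathbb{C})$, and it encodes the projections of each maximal context as 2-bit masks over that context's own pair of eigenprojections, with 0 and 3 standing for the zero and unit operators. All names are hypothetical.

```python
from itertools import product

FULL = 0b11  # the unit operator; 0 is the zero operator
# For each context, the contexts contained in it (toy poset, an assumption).
SUBCONTEXTS = {"C1": ["C1"], "Cz": ["C1", "Cz"], "Cx": ["C1", "Cx"]}
CONTEXTS = list(SUBCONTEXTS)
ALLOWED = {"C1": [0, FULL], "Cz": [0, 1, 2, 3], "Cx": [0, 1, 2, 3]}

def leq(p, q):
    return (p & q) == p

def in_L3(s):
    # second restriction of (239): S(C) <= S(D) whenever D is contained in C
    return all(leq(s[c], s[d]) for c in CONTEXTS for d in SUBCONTEXTS[c])

candidates = (dict(zip(CONTEXTS, values))
              for values in product(*(ALLOWED[c] for c in CONTEXTS)))
L3 = [s for s in candidates if in_L3(s)]

TOP = {c: FULL for c in CONTEXTS}
BOTTOM = {c: 0 for c in CONTEXTS}

def below(a, b):
    return all(leq(a[c], b[c]) for c in CONTEXTS)

def meet(a, b):
    return {c: a[c] & b[c] for c in CONTEXTS}

def join(a, b):
    return {c: a[c] | b[c] for c in CONTEXTS}

def implies(a, b):
    # (241): meet of a(D)^perp v b(D) over all contexts D contained in C
    out = {}
    for c in CONTEXTS:
        value = FULL
        for d in SUBCONTEXTS[c]:
            value &= (FULL ^ a[d]) | b[d]
        out[c] = value
    return out

def neg(a):
    return implies(a, BOTTOM)

# Defining property of a Heyting implication: c ^ a <= b  iff  c <= (a -> b).
is_heyting = all(below(meet(c, a), b) == below(c, implies(a, b))
                 for a in L3 for b in L3 for c in L3)
negation_as_in_244 = all(neg(s) == (TOP if s == BOTTOM else BOTTOM) for s in L3)
regular_elements = [s for s in L3 if neg(neg(s)) == s]
excluded_middle_fails = any(join(s, neg(s)) != TOP for s in L3)
print(len(L3), is_heyting, negation_as_in_244, len(regular_elements))
```

In this toy model the implication (241) has the defining property of a Heyting implication, the negation takes exactly the two values stated in (244), and the only regular elements are the bottom and top elements.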

It seems to me that the Heyting algebra $L_3$ comes closer to describing an intuitionistic logic
for $M_n(\mathbb{C})$ than $L_2$. But there is still a lot of work to be done if intuitionistic logic is to play an
explicit role in the interpretation of quantum mechanics and it is not clear if $L_3$ is a step in the
right direction. A significant feature of quantum mechanics is that propositions about future
measurements can be assigned a probability of turning out to be true in the case of actual
measurements. It is not clear what role these probabilities should play in the intuitionistic
approach. Moreover, there is no generally accepted theory of intuitionistic probability, though
there have been some interesting suggestions (see for example [vF1], [ML1], [ML2] and [Wea]).
It is an open question if any of these suggestions can be applied to $L_3$ in a way that is consistent
with the interpretation of the propositions.

A question that may play an important role is: "What is the meaning of these probabilities
in the Copenhagen interpretation?" In the realist approach probabilities arise from lack of
information about the actual state of the system, but from the Copenhagen point of view, the
state of the system may be considered to be nothing but these probabilities. The information-theoretic
approach I have used to discuss some problems in this thesis seems to advocate a
Bayesian approach, in which the probability of an event expresses the degree of belief that someone
(usually called an agent in this context) has that the event will actually occur, based on the
information available to that person.

Investigations into a Bayesian approach to quantum mechanics are being carried out ([FS]
is one of the latest works and also refers to a lot of earlier work). One of the obstacles is to
find a philosophical motivation for the Born axiom or rather an answer to the question: "Why
are the only admissible probabilities those that obey the structure postulated by the Born
axiom?" This is indeed a natural question. In orthodox quantum mechanics the probabilities
are simply derived from the objective quantum state. But, as argued at the end of Section
5.2, the objective state does not make much sense from an information-theoretic point of
view. A difficulty that arises when trying to find an answer to the above question is that the 
Born axiom allows one to have connections between probabilities that apply to counterfactual 
situations, whereas the Copenhagen interpretation advocates the idea that one should not 
compare counterfactual situations. It seems to me quite possible that these difficulties may 
have more natural solutions if one adopts an intuitionistic point of view, since the intuitionistic 
point of view provides a way of dealing with counterfactual propositions, as shown in Section 
PI 

The final problem that I want to discuss, which arises if one wants to adopt an intuitionist
point of view on quantum mechanics, is the following. The theory of Hilbert spaces, being the
mathematical foundation of quantum mechanics, is based on classical mathematics and relies
on theorems that may not be true from an intuitionistic point of view. Attempts have been
made to find a constructive foundation for quantum mechanics in [Bri], [RB] and [BS1], but,
as far as I know, none of them has produced a constructive approach in which the axioms of
quantum mechanics can be reformulated. However, one may also raise the
question to what extent the Hilbert space formalism is truly necessary for quantum mechanics.
Approaches have become more algebraic over the years, and it may be possible to obtain a
formulation that is more amenable to an intuitionistic mathematical treatment. But the
problem may also find another resolution. One may have an intuitionistic point of view
towards physics while maintaining a classical point of view towards mathematics. However,
it is a deep philosophical question whether this view can be self-consistent.

5.6 Epilogue 

The ideas expressed in Chapter 5 may seem premature and perhaps even a bit far-fetched.
Although I do believe that intuitionistic ideas may play a role in understanding
quantum mechanics, I should stress that I don't think that their role will ever be strictly
necessary; this depends on what one expects from a physical theory.

In [Bel5], Bell expressed his dislike for the standard interpretation of quantum mechanics
(which is roughly quantum mechanics as presented by the six postulates in Section 2.1). For
example, he states that

"the idea that quantum mechanics [...] is exclusively about the results of experiments
would remain disappointing." [Bel5, p. 34]

But, like many physicists, Bell not only opposes the instrumentalist view on science, but he 
also advocates that a physical theory should actually be about Nature itself rather than about 
merely our observations of Nature. In other words, Bell argues that a satisfactory realist 
interpretation should be a necessary criterion for any physical theory. 

The impossibility proofs discussed in this thesis are taken to imply that every realist
interpretation of quantum mechanics necessarily possesses some unsatisfactory features. However,
none of these proofs seems loophole-free to me and I don't think they can be. The reason
is quite simple: one never knows for sure if one has considered all possible loopholes. The
discovery of the 'finite precision' loophole in the Kochen-Specker Theorem, which leads to the
MKC-models, is a striking example, in my opinion.

No matter what exactly is proven by any of the impossibility proofs, it has been established
that a realist interpretation is not possible without adding, removing or modifying
some of the postulates. Indeed, both the von Neumann proof and the Kochen-Specker Theorem
imply that any realist interpretation will encounter difficulties incorporating the notion
of non-commutativity of observables,^111 whereas the Bell-type arguments and the Free Will
Theorem show that any realist interpretation will meet difficulties incorporating the notion of
entanglement.

It is a peculiar aspect of quantum mechanics that the choice between realism and instrumentalism
seems to strongly influence the way the theory should be formulated. But it may
be even more peculiar that most of the realist interpretations known to date (the best-known
probably being Bohmian mechanics, the GRW theory and Everett's many-worlds interpretation)
each describe a significantly different reality. This may cast even more doubt on the realist
view than the introduction of non-locality. The most noticeable advantage some of the present
realist interpretations of quantum mechanics have is that they are easy to understand; they
state what is actually being measured when a measurement is performed, and sometimes even
describe what constitutes a measurement. In other words, they only seem compelling due to
the vagueness and difficulty of the standard interpretation.

But when comprehension rather than truth becomes the decisive criterion for physical 
theories, there is no reason to adopt a realist view per se. Comprehension may also be found 
using other approaches, and Bohr's views seem a good starting point to me. The Copenhagen
interpretation is a departure from the materialist paradise upheld by classical physics.^112 The
emphasis is on the subject (the experimenter) and how the subject interacts with the object 
(through experiments) without making any statements about the object by itself. In a certain 
sense, it is an anti-Copernican revolution; the Copenhagen interpretation puts the observer 
back into the center of the universe. 

It should be noted that the Bohrian approach is not really an instrumentalist approach.
A true instrumentalist would be content with just the mathematical formulation of a theory
along with some rules for how to apply it. In fact, I think Bohr would agree with Bell that there's
more to physics than just measurement results. The aim is to acquire some understanding
of Nature. The difference is that realists start by making assumptions about Nature itself
whereas Bohr starts from the point of view of the observer, accepting the (likely) possibility
that in this approach one may never come to a complete understanding of Nature itself.

This is roughly the point of view I had when I first came up with the idea of a possible role
for intuitionistic logic in quantum mechanics. It was only later that I found some resemblances
to the ideas expressed by Bohr.^113 In a realist interpretation the law of excluded middle
seems necessary; every statement about something that is, is either true or false. But from the
point of view of the observer this is not obvious. For example, a statement like 'the particle
went either through slit one or slit two' has no significant meaning if one does not perform an
experiment that decides through which slit the particle went.

Intuitionistic logic also seems the natural logic that is used when deriving theories from
experimental data, since such theories naturally do not talk about unperformed measurements.
From this point of view it may not be all that surprising that a theory (i.e. quantum mechanics)
could have emerged at the beginning of the twentieth century that seemed to defy the notions
of logic itself, since the methods used to construct the theory use a logic that differs from the
one demanded for the theory itself.

^111 It is usually understood to imply the notion of contextuality, but as discussed in this thesis, this conclusion
is not a necessary one. In the MKC-models, non-commutativity leads to statistical independence (see Corollary
3.6). Either way, it seems unlikely that the notion of non-commutativity will vanish completely in a realist
interpretation.

^112 This departure somewhat resembles the departure from Cantor's paradise that took place in mathematics
from around 1900. Hilbert would have nothing of it: "Aus dem Paradies, das Cantor uns geschaffen, soll uns
niemand vertreiben können." ("No one shall be able to expel us from the paradise that Cantor has created
for us.") [Hil1, p. 170]. And many mathematicians still agree with this view to some extent.

^113 Although Bohr stressed the necessity of the use of classical logic in physics: "all well-defined experimental
data [...] must be expressed in ordinary language making use of common logic." [Boh4, p. 317].